[Computer-go] Value network that doesn't want to learn.
Vincent Richard
vincent.francois.richard at gmail.com
Fri Jun 23 02:56:59 PDT 2017
Finally found the problem. In the end, it was as stupid as expected:
When I pick a game for batch creation, I select a limited number of
moves from the game at random. For the value network I use around 8-16
moves to avoid overfitting the data (I can't take just 1, or else the
I/O operations slow down the training), and for the other networks I
simply take all the moves. Or at least that was what I thought my code
was doing. Instead of picking N random moves from the game, it was
picking the first N moves in a random order. So... my value network was
trained to tell me the game is balanced at the beginning...
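For anyone who wants to see the difference concretely, here is a minimal sketch of the two behaviors (the function names and the use of Python's random module are my own illustration, not the actual training code):

```python
import random

def pick_moves_buggy(game_moves, n):
    # Bug described above: takes the FIRST n moves and shuffles them,
    # so the batch only ever sees the opening of the game.
    head = list(game_moves[:n])
    random.shuffle(head)
    return head

def pick_moves_fixed(game_moves, n):
    # Intended behavior: sample n distinct positions from anywhere
    # in the game, so late positions (where the outcome is clearer)
    # are represented too.
    return random.sample(list(game_moves), min(n, len(game_moves)))

# With a 200-move game and n=8, the buggy version can only ever
# return moves 0-7; the fixed version can return any 8 moves.
moves = list(range(200))
print(sorted(pick_moves_buggy(moves, 8)))   # always [0, 1, ..., 7]
print(sorted(pick_moves_fixed(moves, 8)))   # 8 moves from the whole game
```

A value network trained on the buggy sampler only ever sees near-empty boards, which is exactly why it kept predicting a balanced game.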
On 20-Jun-17 at 5:48 AM, Gian-Carlo Pascutto wrote:
> On 19/06/2017 21:31, Vincent Richard wrote:
>> - The data is then analyzed by a script which extracts all kinds of
>> features from the games. When I'm training a network, I load the features
>> I want from this analysis to build the batch. I have 2 possible methods
>> for the batch construction. I can either add moves one after the other
>> (the fast mode) or pick random moves among different games (slower but
>> reduces the variance).
> You absolutely need the latter, especially as for outcome prediction the
> moves from the same game are not independent samples.
>
>> During some of the tests, all the networks I was training had the same
>> layers except for the last. So as you suggested, I was also wondering if
>> this last layer wasn't the problem. Yet, I haven't found any error.
> ...
>> However, if I feed a stupid
>> value as target output (for example black always win) it has no trouble
>> learning.
> A problem with side to move/won side marking in the input or feature
> planes, or with the expected outcome (0 vs 1 vs -1)?
>