[Computer-go] Value network that doesn't want to learn.

Vincent Richard vincent.francois.richard at gmail.com
Fri Jun 23 02:56:59 PDT 2017


Finally found the problem. In the end, it was as stupid as expected:

When I pick a game for batch creation, I randomly select a limited 
number of moves inside the game. In the case of the value network I use 
around 8-16 moves to avoid overfitting the data (I can't take just 1, or 
the I/O operations slow down the training), and for the other networks I 
would simply take all the moves. Or at least that was what I thought my 
code was doing. Instead of picking N random moves from the game, it was 
picking the first N moves in a random order. So... my value network was 
trained to tell me the game is balanced at the beginning...
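A minimal sketch of the bug and the fix, assuming each game is simply a list of move records (the function names and data layout here are hypothetical, not the actual training code):

```python
import random

def sample_moves_buggy(game, n):
    # The bug: take the FIRST n moves, then shuffle them.
    # Every training position still comes from the opening,
    # where the game is (nearly) balanced.
    moves = game[:n]          # slice copies, so shuffle won't mutate `game`
    random.shuffle(moves)
    return moves

def sample_moves_fixed(game, n):
    # The fix: sample n distinct positions from the whole game.
    return random.sample(game, min(n, len(game)))

game = list(range(200))       # stand-in for a 200-move game record
opening_only = sample_moves_buggy(game, 8)
anywhere = sample_moves_fixed(game, 8)
```

With the buggy version, `sorted(opening_only)` is always `[0, 1, ..., 7]` no matter how long the game is, which is why the network could only ever learn "the position is balanced".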


On 20-Jun-17 at 5:48 AM, Gian-Carlo Pascutto wrote:
> On 19/06/2017 21:31, Vincent Richard wrote:
>> - The data is then analyzed by a script which extracts all kinds of
>> features from games. When I'm training a network, I load the features I
>> want from this analysis to build the batch. I have 2 possible methods
>> for the batch construction. I can either add moves one after the other
>> (the fast mode) or pick random moves among different games (slower but
>> reduces the variance).
> You absolutely need the latter, especially as for outcome prediction the
> moves from the same game are not independent samples.
>
>> During some of the tests, all the networks I was training had the same
>> layers except for the last. So as you suggested, I was also wondering if
>> this last layer wasn’t the problem. Yet, I haven’t found any error.
> ...
>> However, if I feed a stupid
>> value as target output (for example black always win) it has no trouble
>> learning.
> A problem with side to move/won side marking in the input or feature
> planes, or with the expected outcome (0 vs 1 vs -1)?
>
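Gian-Carlo's point above, that positions from the same game are not independent samples for outcome prediction, can be illustrated with a toy batch builder that draws each position from a randomly chosen game (the structure here is an assumption for illustration, not the actual pipeline):

```python
import random

def build_batch(games, batch_size):
    # Draw each position from a different, randomly chosen game so
    # samples in a batch are approximately independent. Consecutive
    # positions from a single game all share the same outcome label,
    # which inflates correlation within the batch.
    batch = []
    for _ in range(batch_size):
        game = random.choice(games)
        batch.append(random.choice(game))
    return batch

# Toy data: 10 games of 50 positions, tagged (game_id, move_number).
games = [[(g, m) for m in range(50)] for g in range(10)]
batch = build_batch(games, 32)
```

This is the "slower but reduces the variance" mode from the quoted message; the fast mode (appending moves of one game back to back) corresponds to replacing the inner `random.choice(games)` with a single fixed game.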


