[Computer-go] CNN with 54% prediction on KGS 6d+ data

Wed Dec 9 04:59:22 PST 2015

Thank you for the feedback, everyone.

Regarding the CPU-GPU roundtrips, I'm wondering whether it'd be
possible to recursively apply the output matrix to the prior input
matrix to update board positions within the GPU, without any
actual (possibly CPU-based) evaluation until all branches reach
game-ending states. I assume illegal moves would mostly fall away when
sticking to the top ten or top five moves proposed by the CNN.
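
To make the "top ten or top five" step concrete, here is a minimal
CPU-side sketch in C++. It assumes the CNN output is available as one
probability per board point and that a legality mask exists; the name
topKLegalMoves and both parameters are hypothetical and not part of
oakfoam or caffe. A GPU-only variant would have to express the same
selection inside a kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of oakfoam or caffe.
// probs: CNN output, one probability per board point (size*size entries).
// legal: true where the side to move may play (assumed to be given).
// Returns up to k point indices, highest probability first.
std::vector<int> topKLegalMoves(const std::vector<float>& probs,
                                const std::vector<bool>& legal,
                                std::size_t k) {
  assert(probs.size() == legal.size());
  // Collect the indices of all legal points.
  std::vector<int> candidates;
  for (std::size_t i = 0; i < probs.size(); ++i)
    if (legal[i]) candidates.push_back(static_cast<int>(i));
  // Never ask for more moves than there are legal points.
  k = std::min(k, candidates.size());
  // Order only the first k entries by descending probability.
  std::partial_sort(candidates.begin(),
                    candidates.begin() + static_cast<std::ptrdiff_t>(k),
                    candidates.end(),
                    [&probs](int a, int b) { return probs[a] > probs[b]; });
  candidates.resize(k);
  return candidates;
}
```

Without the mask (all entries true) this reduces to a plain top-k over
the CNN output, which is the case where I'd hope illegal moves rarely
show up anyway.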

As for performance, I could imagine initialization being relatively
slow, but I wouldn't be surprised if GPU-based CNN evaluation could
handle a large batch of branches, running through many boards in
parallel with comparatively minor performance impact, so that the
gain outweighs the initial overhead.
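
As a back-of-the-envelope check, here is a small C++ sketch of the
arithmetic, assuming all boards in a batch advance one move per forward
pass (lockstep). The function names are hypothetical and every input is
a guess to be measured, not a benchmark.

```cpp
#include <cassert>

// Estimated wall-clock seconds to finish one lockstep batch of playouts:
// every board advances by one move per batched forward pass.
double lockstepBatchSeconds(int movesPerPlayout, double secondsPerBatchedPass) {
  assert(movesPerPlayout >= 0 && secondsPerBatchedPass >= 0.0);
  return movesPerPlayout * secondsPerBatchedPass;
}

// Amortized seconds per playout when batchSize boards share each pass.
double secondsPerPlayout(int movesPerPlayout, double secondsPerBatchedPass,
                         int batchSize) {
  assert(batchSize > 0);
  return lockstepBatchSeconds(movesPerPlayout, secondsPerBatchedPass) / batchSize;
}
```

With Álvaro's guess of 0.1 seconds per pass and Josef's 200-300 moves
per game, a single playout comes to roughly 20-30 seconds. If a batch of
100 boards took the same 0.1 seconds per pass (an optimistic assumption
I have not measured), that would amortize to 0.2-0.3 seconds per playout.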

Whether this would provide a better evaluation function than MCTS I
don't know, but just like Alvaro I would love to see this tried, even
if just to rule it out for the moment.

I've got a GTX 980 Ti on a 4790K with 16 GB at home. For a low-key
test I could run Windows (CUDA installed and running, tested with
pylearn2) or Ubuntu from a live setup on USB, and I would be willing to
run test code if somebody provided a package I could simply download
and execute.

All the best

Michael

On Tue, Dec 8, 2015 at 7:52 PM, Álvaro Begué <alvaro.begue at gmail.com> wrote:
> Of course whether these "neuro-playouts" are any better than the heavy
> playouts currently being used by strong programs is an empirical question.
> But I would love to see it answered...
>
>
>
> On Tue, Dec 8, 2015 at 1:31 PM, David Ongaro <david.ongaro at hamburg.de>
> wrote:
>>
>> Did everyone forget the fact that stronger playouts don't necessarily lead
>> to a better evaluation function? (Yes, that's what playouts essentially
>> are: a dynamic evaluation function.) This holds even under the assumption
>> that we can reach the same number of playouts per move.
>>
>>
>> On 08 Dec 2015, at 10:21, Álvaro Begué <alvaro.begue at gmail.com> wrote:
>>
>> I don't think the CPU-GPU communication is what's going to kill this idea.
>> The latency in actually computing the feed-forward pass of the CNN is going
>> to be on the order of 0.1 seconds (I am guessing here), which means
>> finishing the first playout will take many seconds.
>>
>> So perhaps it would be interesting to do something like this for
>> correspondence games, but not for regular games.
>>
>>
>> Álvaro.
>>
>>
>>
>> On Tue, Dec 8, 2015 at 12:03 PM, Petr Baudis <pasky at ucw.cz> wrote:
>>>
>>>   Hi!
>>>
>>>   Well, for this to be practical the entire playout would have to be
>>> executed on the GPU, with no round-trips to the CPU.  That's what my
>>> email was aimed at.
>>>
>>> On Tue, Dec 08, 2015 at 04:37:05PM +0000, Josef Moudrik wrote:
>>> > Regarding full CNN playouts, I think the problem is that a playout is a
>>> > long serial process, given 200-300 moves a game. You need to construct
>>> > planes and transfer them to the GPU for each move and read the result
>>> > back (at least with current CNN implementations, afaik), so my guess
>>> > would be that such a playout would take time on the order of seconds. So
>>> > there seems to be a tradeoff: CNN playouts are (probably much) better (at
>>> > "playing better games") than e.g. distribution playouts, but whether this
>>> > is worth the implied (probably much) lower height of the MC tree is a
>>> > question.
>>> >
>>> > Maybe if you had a really large number of GPUs and a very long thinking
>>> > time, this could be the way.
>>> >
>>> > Josef
>>> >
>>> > On Tue, Dec 8, 2015 at 5:17 PM Petr Baudis <pasky at ucw.cz> wrote:
>>> >
>>> > >   Hi!
>>> > >
>>> > >   In case someone is looking for a starting point to actually implement
>>> > > Go rules etc. on GPU, you may find this useful:
>>> > >
>>> > >
>>> > >
>>> > > https://www.mail-archive.com/computer-go@computer-go.org/msg12485.html
>>> > >
>>> > >   I wonder if you can easily integrate caffe GPU kernels in another GPU
>>> > > kernel like this?  But without training, reimplementing the NN could be
>>> > > pretty straightforward.
>>> > >
>>> > > On Tue, Dec 08, 2015 at 04:53:14PM +0100, Michael Markefka wrote:
>>> > > > Hello Detlef,
>>> > > >
>>> > > > I've got a question regarding CNN-based Go engines that I couldn't
>>> > > > find anything about on this list. As I've been following your posts
>>> > > > here, I thought you might be the right person to ask.
>>> > > >
>>> > > > Have you ever tried using the CNN for complete playouts? I know that
>>> > > > CNNs have been tried for move prediction, immediate scoring and move
>>> > > > generation to be used in an MC evaluator, but couldn't find anything
>>> > > > about CNN-based playouts.
>>> > > >
>>> > > > It might only be feasible to play out the CNN's first choice move for
>>> > > > evaluation purposes, but considering how well performance scales with
>>> > > > batch size, especially in GPU-based CNN applications, it might be
>>> > > > possible to set up something like 10 candidate moves and 10 reply
>>> > > > candidate moves, then have the CNN play out its first choice move for
>>> > > > those 100 board positions until the end, and then sum up the scores
>>> > > > again for move evaluation (and/or possibly apply some other tried and
>>> > > > tested methods like minimax). The number 10 is meant to be
>>> > > > illustrative rather than representative; other configurations of
>>> > > > depth and width in position generation and evaluation would be
>>> > > > possible.
>>> > > >
>>> > > > It feels like a CNN can provide a very focused, high-quality width in
>>> > > > move generation, but it might also be possible to apply that quality
>>> > > > to depth of evaluation.
>>> > > >
>>> > > > Any thoughts to share?
>>> > > >
>>> > > >
>>> > > > All the best
>>> > > >
>>> > > > Michael
>>> > > >
>>> > > > On Tue, Dec 8, 2015 at 4:13 PM, Detlef Schmicker <ds2 at physik.de>
>>> > > > wrote:
>>> > > > > -----BEGIN PGP SIGNED MESSAGE-----
>>> > > > > Hash: SHA1
>>> > > > >
>>> > > > > Hi,
>>> > > > >
>>> > > > > as somebody asked, I will offer my current CNN for testing.
>>> > > > >
>>> > > > > It has 54% prediction on KGS 6d+ data (which I thought would be state
>>> > > > > of the art when I started training, but it is not anymore :).
>>> > > > >
>>> > > > > It has these input layers:
>>> > > > > 1, 2, 3 and 4+ libs, playing color (one plane each)
>>> > > > > 1, 2, 3 and 4+ libs, opponent color (one plane each)
>>> > > > > empty points
>>> > > > > last move
>>> > > > > second last move
>>> > > > > third last move
>>> > > > > fourth last move
>>> > > > >
>>> > > > > It is fully convolutional, so by just editing the golast19.prototxt
>>> > > > > file you can use it for 13x13 as well, as I did last Sunday. It was
>>> > > > > used in the November tournament as well.
>>> > > > >
>>> > > > > You can find it
>>> > > > > http://physik.de/CNNlast.tar.gz
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > If you try it, here are some points I would like to discuss:
>>> > > > >
>>> > > > > - It seems to me that the playouts get much more important with such
>>> > > > > a strong move prediction. Often the move prediction seems better than
>>> > > > > the playouts (I use 8000 at the moment against pachi 32000 with about
>>> > > > > 70% winrate on 19x19, but with an extremely focused progressive
>>> > > > > widening: a=400, where a=20 was usual).
>>> > > > >
>>> > > > > - Life and death becomes worse. My interpretation is that the strong
>>> > > > > CNN does not play moves which obviously do not help a group to live,
>>> > > > > but which would help the playouts to recognize that the group is dead.
>>> > > > > (http://physik.de/example.sgf : the top black group was read as very
>>> > > > > dead with the weaker move prediction; with the good CNN it was 30%
>>> > > > > alive or so :( )
>>> > > > >
>>> > > > >
>>> > > > > OK, hope you try it, as you know our engine oakfoam is open source :)
>>> > > > > We just merged all the CNN stuff into the main branch!
>>> > > > > https://bitbucket.org/francoisvn/oakfoam/wiki/Home
>>> > > > > http://oakfoam.com
>>> > > > >
>>> > > > >
>>> > > > > Do the very best with the CNN
>>> > > > >
>>> > > > > Detlef
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > code:
>>> > > > > // Input planes, plane-major: data[plane*size*size + size*j + k].
>>> > > > > // Planes 0-3: stones of the color to move with 1, 2, 3, 4+ liberties.
>>> > > > > // Planes 4-7: the same for the opponent. Plane 8: empty points.
>>> > > > > if (col==Go::BLACK) {
>>> > > > >   for (int j=0;j<size;j++)
>>> > > > >     for (int k=0;k<size;k++) {
>>> > > > >       // clear all planes at this point
>>> > > > >       for (int l=0;l<caffe_test_net_input_dim;l++)
>>> > > > >         data[l*size*size+size*j+k]=0;
>>> > > > >       int pos=Go::Position::xy2pos(j,k,size);
>>> > > > >       // liberty count minus one, capped at 3, selects the plane
>>> > > > >       int libs=0;
>>> > > > >       if (board->inGroup(pos))
>>> > > > >         libs=board->getGroup(pos)->numRealLibs()-1;
>>> > > > >       if (libs>3) libs=3;
>>> > > > >       if (board->getColor(pos)==Go::BLACK)
>>> > > > >         data[(0+libs)*size*size + size*j + k]=1.0;
>>> > > > >       else if (board->getColor(pos)==Go::WHITE)
>>> > > > >         data[(4+libs)*size*size + size*j + k]=1.0;
>>> > > > >       else if (board->getColor(pos)==Go::EMPTY)
>>> > > > >         data[8*size*size + size*j + k]=1.0;
>>> > > > >     }
>>> > > > > }
>>> > > > > // Same as above with the roles of black and white swapped.
>>> > > > > if (col==Go::WHITE) {
>>> > > > >   for (int j=0;j<size;j++)
>>> > > > >     for (int k=0;k<size;k++) {
>>> > > > >       for (int l=0;l<caffe_test_net_input_dim;l++)
>>> > > > >         data[l*size*size+size*j+k]=0;
>>> > > > >       int pos=Go::Position::xy2pos(j,k,size);
>>> > > > >       int libs=0;
>>> > > > >       if (board->inGroup(pos))
>>> > > > >         libs=board->getGroup(pos)->numRealLibs()-1;
>>> > > > >       if (libs>3) libs=3;
>>> > > > >       if (board->getColor(pos)==Go::BLACK)
>>> > > > >         data[(4+libs)*size*size + size*j + k]=1.0;
>>> > > > >       else if (board->getColor(pos)==Go::WHITE)
>>> > > > >         data[(0+libs)*size*size + size*j + k]=1.0;
>>> > > > >       else if (board->getColor(pos)==Go::EMPTY)
>>> > > > >         data[8*size*size + size*j + k]=1.0;
>>> > > > >     }
>>> > > > > }
>>> > > > > // Planes 9-12: last, second last, third last and fourth last move.
>>> > > > > if (caffe_test_net_input_dim > 9) {
>>> > > > >   if (board->getLastMove().isNormal()) {
>>> > > > >     int j=Go::Position::pos2x(board->getLastMove().getPosition(),size);
>>> > > > >     int k=Go::Position::pos2y(board->getLastMove().getPosition(),size);
>>> > > > >     data[9*size*size+size*j+k]=1.0;
>>> > > > >   }
>>> > > > >   if (board->getSecondLastMove().isNormal()) {
>>> > > > >     int j=Go::Position::pos2x(board->getSecondLastMove().getPosition(),size);
>>> > > > >     int k=Go::Position::pos2y(board->getSecondLastMove().getPosition(),size);
>>> > > > >     data[10*size*size+size*j+k]=1.0;
>>> > > > >   }
>>> > > > >   if (board->getThirdLastMove().isNormal()) {
>>> > > > >     int j=Go::Position::pos2x(board->getThirdLastMove().getPosition(),size);
>>> > > > >     int k=Go::Position::pos2y(board->getThirdLastMove().getPosition(),size);
>>> > > > >     data[11*size*size+size*j+k]=1.0;
>>> > > > >   }
>>> > > > >   if (board->getForthLastMove().isNormal()) {
>>> > > > >     int j=Go::Position::pos2x(board->getForthLastMove().getPosition(),size);
>>> > > > >     int k=Go::Position::pos2y(board->getForthLastMove().getPosition(),size);
>>> > > > >     data[12*size*size+size*j+k]=1.0;
>>> > > > >   }
>>> > > > > }
>>> > > > >
>>> > > > > -----BEGIN PGP SIGNATURE-----
>>> > > > > Version: GnuPG v2.0.22 (GNU/Linux)
>>> > > > >
>>> > > > > iQIcBAEBAgAGBQJWZvOlAAoJEInWdHg+Znf4t8cP/2a9fE7rVb3Hz9wvdMkvVkFS
>>> > > > > 4Y3AomVx8i56jexVyXuzKihfizVRM7x6lBiwjYBhj4Rm9UFWjj2ZvDzBGCm3Sy4I
>>> > > > > SpG8D01VnzVR6iC1YTu3ecv9Wo4pTjc7NL5pAxiZDB0V7OTRklfZAYsX4mWyHygn
>>> > > > > cr1pIb79/9QfBf/johmuutXJIwYfVG9ShR1+udbxs3aU3QDAbJJ4eTs8oj+NqFpg
>>> > > > > JolEEEg3wY693e77SqbUbjxR3kSsysoz9h1nKnR/ZjHByqlwNvSz9ho9eU0rKhaK
>>> > > > > GSQ22/c1VPIZhr24FYBbYNYweOzDtonLpuUFCPSnYVels3h/I/LlqV3MeDo6wuZ2
>>> > > > > QCPp5+11o4JzvEt7A4zfJCtEOEH0W2/+IjRcIkAVOo65OV/pPsz2EjHehMU6PC6m
>>> > > > > vXA/kPx0jqUm1qSb0qCgMq5ZvSqfpcCY7JOlkEwkDBS1fty9sU0hqst3zXR0KGtn
>>> > > > > rFuoREmQYi/mkjZfS2Q4AHiZUDbDZUKzRegUA+gR/eKAmJsmWeTDEI9ZAXgxL0cB
>>> > > > > p1HGBNDEUKGk+ruq0gIe5vYygyBcJV0BbbBnweDjeZnlG8vLUAVoMF6V/q3gkZb1
>>> > > > > P61rfE4d9dohfGBsZ+UWltRyWMj09ieR2G2zCDpIXyxEuoV6CTAlLzDuhmqFa2ma
>>> > > > > Fp3lK/uLhOucXwBtStdx
>>> > > > > =E47K
>>> > > > > -----END PGP SIGNATURE-----
>>> > > > > _______________________________________________
>>> > > > > Computer-go mailing list
>>> > > > > Computer-go at computer-go.org
>>> > > > > http://computer-go.org/mailman/listinfo/computer-go
>>> > >
>>> > > --
>>> > >                                 Petr Baudis
>>> > >         If you have good ideas, good data and fast computers,
>>> > >         you can do almost anything. -- Geoffrey Hinton
>>>
>>>
>>> --
>>>                                 Petr Baudis
>>>         If you have good ideas, good data and fast computers,
>>>         you can do almost anything. -- Geoffrey Hinton