[Computer-go] pachi2

Jean-loup Gailly jloup at gailly.net
Thu Jan 13 03:20:35 PST 2011


Here is some preliminary information on the distributed version of pachi.
Petr (pasky) and I will publish all the details later; this is just to give
you an idea of what we are doing. Pasky is the main author of pachi and
wrote most of the single-machine code. I wrote the distributed code and
some other improvements.

All the code, including the distributed code, is GPL and available at
http://repo.or.cz/w/pachi.git/

The distributed pachi uses simple TCP/IP sockets, not MPI. This makes it
portable to many environments. A master process regularly receives stats
updates from all the slaves and distributes the aggregated updates back
to all slaves. The master-slave protocol is specific to pachi but it is
rather simple. It is fault tolerant: if a slave dies, the master resends
the whole game to the new slave that replaces it. If the master dies
during a test run, I discard the current game and start a new one. If the
master dies when running for KGS, I kill the kgsGtp program and start a
new one; KGS then resends the partial game and we continue from there.
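
To make the aggregation step concrete, here is a minimal sketch in Python.
It is not pachi's actual protocol or data layout: the per-move
(wins, playouts) dictionaries and the names aggregate_stats and best_move
are assumptions made up for illustration. It only shows the idea of a
master summing the stats increments reported by several slaves.

```python
from collections import defaultdict

def aggregate_stats(slave_updates):
    """Sum per-move (wins, playouts) increments reported by each slave.

    slave_updates is a list with one dict per slave, mapping a move
    name to a (wins, playouts) pair. This layout is hypothetical.
    """
    totals = defaultdict(lambda: [0, 0])
    for update in slave_updates:
        for move, (wins, playouts) in update.items():
            totals[move][0] += wins
            totals[move][1] += playouts
    return {move: (w, p) for move, (w, p) in totals.items()}

def best_move(totals):
    """Pick the move with the most playouts (one common MCTS choice)."""
    return max(totals, key=lambda m: totals[m][1])

# Two slaves report increments for partly overlapping moves.
updates = [
    {"D4": (60, 100), "Q16": (30, 50)},
    {"D4": (55, 100), "C3": (10, 40)},
]
merged = aggregate_stats(updates)
print(merged)
print(best_move(merged))
```

In a real setup the merged totals would then be sent back to every slave,
as described above; the sketch leaves out the networking entirely.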

I measured scalability both on a single machine and in distributed
mode. All the details will be published, but here is a summary. In single
machine mode, doubling the number of cores gains roughly 100 Elo, or one
stone (I measured one stone to be approximately 100 Elo). This holds up
to the number of cores I can test (20 per machine; the other cores are
reserved for the OS and other apps).

In distributed mode, doubling the number of machines initially gains
approximately 50 Elo (half a stone), up to 8 machines. Above this we
quickly hit a scalability limit, and the best result so far is with 64
machines; this is the configuration used for the KGS tournament (starting
at round 4) and on KGS right now. 128 machines currently perform much
worse than 64.
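
As a rough worked example of the figures above, the sketch below turns
"Elo per doubling" into a total gain and into an expected win rate. The
function names are made up for illustration, the choice of 16 cores is
just an example within the tested range, and the win-rate conversion uses
the standard logistic Elo formula rather than anything measured for pachi.

```python
import math

def elo_gain(factor, elo_per_doubling):
    """Total Elo gained when resources grow by `factor`,
    assuming a constant gain per doubling."""
    return elo_per_doubling * math.log2(factor)

def expected_score(elo_diff):
    """Expected score of the stronger side under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))

# Single machine: about 100 Elo per doubling, e.g. from 1 to 16 cores.
single = elo_gain(16, 100)
# Distributed: about 50 Elo per doubling, only up to 8 machines.
distributed = elo_gain(8, 50)

print(single, distributed)
print(round(expected_score(100), 3), round(expected_score(50), 3))
```

Under these assumptions, one doubling of cores corresponds to winning
about 64% of games against the smaller configuration, and one doubling of
machines to about 57%.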

Preliminary analysis of the lost games shows that the current code
has inherent scalability limits because the playouts are biased.
When the playouts incorrectly judge the life status of a group,
the results will be bad no matter how many cores and machines
work on it. We are of course working on this to eliminate these
scalability limits.
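
The effect of biased playouts can be illustrated with a toy simulation.
This is not pachi code: the true win rate (0.40) and the bias (+0.15) are
invented numbers, and a playout is reduced to a single biased coin flip.
The point is only that more samples shrink the noise, not the bias.

```python
import random

def estimate_win_rate(true_rate, bias, n_playouts, rng):
    """Estimate a win rate from playouts whose outcome is shifted by `bias`.

    Each playout is modeled as a coin flip with probability
    true_rate + bias (clamped to [0, 1]). Both values are hypothetical.
    """
    p = min(1.0, max(0.0, true_rate + bias))
    wins = sum(1 for _ in range(n_playouts) if rng.random() < p)
    return wins / n_playouts

rng = random.Random(0)  # fixed seed so runs are repeatable
for n in (100, 10_000, 100_000):
    est = estimate_win_rate(0.40, 0.15, n, rng)
    print(n, round(est, 3))
```

As the number of playouts grows, the estimate settles near 0.55 instead
of the true 0.40, so a losing position keeps looking like a winning one
however much hardware is added.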

Pachi has benefited enormously from ideas published on the computer-go
mailing list and in many papers.  By making its source completely open we
hope to encourage further progress in this area.

Petr and Jean-loup