[Computer-go] semeai example of winning rate
Brian Sheppard
sheppardco at aol.com
Wed Jan 19 06:12:56 PST 2011
> Blitz games may give fast but wrong results, penalizing a new patch for
> lack of speed.
My experience is exactly the opposite: improvements make the program
*slower* but stronger.
The risk to scalability is that we will bias the search by focusing on
variations that a blitz program cannot discover, but a massively scalable
system could.
We may already have a historical instance of this: the "fillboard" policy of
Mogo. Mogo's fillboard playout policy randomly selects N points on the board
(typically N = 1 or 2). If any of them is empty and surrounded by empty
points, the playout plays that move. Various teams have reported negative
results from this policy. The Mogo team has stated that fillboard did not
help at 9x9 but was important at 19x19, and that its value was positive at
high scalability levels.
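A minimal sketch of what such a fillboard check might look like, assuming a
board represented as a dict from (row, col) to a stone color, with empty
points absent; the names and representation are my own for illustration, not
Mogo's actual code:

```python
import random

SIZE = 19
EMPTY = None  # empty points are simply absent from the board dict


def neighbors(p):
    """Orthogonal neighbors of p that lie on the board."""
    r, c = p
    return [(r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= r + dr < SIZE and 0 <= c + dc < SIZE]


def fillboard_move(board, n=2, rng=random):
    """Sample n random points; return the first one that is empty and
    surrounded by empty points, else None (fall through to the normal
    playout policy)."""
    for _ in range(n):
        p = (rng.randrange(SIZE), rng.randrange(SIZE))
        if board.get(p) is EMPTY and all(board.get(q) is EMPTY
                                         for q in neighbors(p)):
            return p
    return None
```

On an empty board this almost always fires, scattering stones into open
areas early; on a crowded board it usually returns None and costs little.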
Another possible instance: Pachi's playout policy. Pachi conditions each
generator in the playout policy on a probability weight. For example, the
rule that says "play around the last point if you see this pattern" is
executed only with a certain probability. IIRC, they report that executing
each rule with 90% probability is marginally better than using 100%. I am
pretty sure that deep and shallow searchers can differ on this: a deep
search can afford to explore, because the large MCTS tree will sort things
out, whereas a shallow search is better off gambling that its rule is
correct.
This research area has many counter-intuitive and contradictory results.
Researchers really have to keep an open mind and test a lot of variations.
Brian