[Computer-go] useless ko threats
Jacques Basaldúa
jacques at dybot.com
Wed Mar 7 03:27:14 PST 2012
Rémi Coulom wrote:
> Accelerated UCT does this:
https://www.conftool.net/acg13/index.php/Hashimoto-Accelerated_UCT_and_Its_Application_to_Two-Player_Games-111.pdf?page=downloadPaper&filename=Hashimoto-Accelerated_UCT_and_Its_Application_to_Two-Player_Games-111.pdf&form_id=111&form_version=final
This idea was mentioned, circa 2009, on this list. It seemed intuitively
right that giving more weight to the most recent results should improve play.
I implemented it pretty much like the author of the paper in 2009 and it is
still a configurable option in my program. I also used simulated sources of
results. In simulation it became clear that it was working fine and the
"learning" evaluation was a much better estimator of the final value than
the "non-learning" one. (In my implementation it is called "estimate trend".)
When playing games, not only did it not work, it made the program clearly
weaker. Even constants that result in very slow learning can lose 10 to 20
Elo points. No value has ever made it stronger; at best I can fade it out
completely, making it irrelevant. And I have tested it more than once,
because I believed in it, as the program has evolved from double-digit kyu to
3-4 kyu, always with negative results. Has anyone else tried it? I am still
interested in understanding why it doesn't work (for me), as it seems a good
idea.
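
For anyone who wants to reproduce the simulation side of this, here is a
minimal sketch of the general idea. It is not my actual code and not
necessarily the exact formula from the paper: it assumes a plain exponential
discount (the hypothetical `decay` parameter) applied to a node's win and
visit counts, and a simulated source whose true win rate drifts linearly from
`start` to `end`. The names `NodeStats` and `simulate` are made up for the
example.

```python
import random


class NodeStats:
    """Two win-rate estimators for one node (illustrative, hypothetical names)."""

    def __init__(self, decay=0.99):
        self.decay = decay    # assumed discount factor; 1.0 disables the recency weighting
        self.visits = 0       # plain visit count
        self.wins = 0.0       # plain win sum
        self.w_visits = 0.0   # discounted visit count
        self.w_wins = 0.0     # discounted win sum

    def update(self, result):
        """Record one playout result (1.0 = win, 0.0 = loss)."""
        self.visits += 1
        self.wins += result
        # Older results are multiplied by decay once per new result.
        self.w_visits = self.decay * self.w_visits + 1.0
        self.w_wins = self.decay * self.w_wins + result

    def plain_mean(self):
        """The "non-learning" estimate: every result has the same weight."""
        return self.wins / self.visits if self.visits else 0.5

    def recent_mean(self):
        """The "learning" estimate: recent results weigh more."""
        return self.w_wins / self.w_visits if self.w_visits else 0.5


def simulate(n=5000, start=0.3, end=0.7, decay=0.99, seed=1):
    """Feed a drifting Bernoulli source into both estimators."""
    rng = random.Random(seed)
    stats = NodeStats(decay)
    for i in range(n):
        p = start + (end - start) * i / (n - 1)  # true win rate at step i
        stats.update(1.0 if rng.random() < p else 0.0)
    return stats.plain_mean(), stats.recent_mean()


if __name__ == "__main__":
    plain, recent = simulate()
    print("plain mean:  %.3f" % plain)
    print("recent mean: %.3f" % recent)
```

On a drifting source like this the discounted estimate tracks the final value
much more closely than the plain mean, which matches what I saw in simulation.
It says nothing about playing strength, which is exactly the puzzle.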
Jacques.