[Computer-go] Computer-go Digest, Vol 12, Issue 79
Hendrik Baier
hendrik.baier at googlemail.com
Wed Jan 26 01:00:47 PST 2011
Hi Aja,
I would be interested in your results. I think the LGRF policy is only a
small first step in the direction of more adaptive playouts (and,
hopefully, toward overcoming the horizon effect).
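
For context, here is a minimal sketch of how a last-good-reply table with
forgetting might be maintained, following the general idea described in the
"Power of Forgetting" paper. The names update_lgrf and suggest_reply, the
(player, move) playout representation, and the single-move reply context are
assumptions made for illustration; they are not taken from the thesis or
from any particular engine.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Move = Hashable
ReplyTable = Dict[Move, Move]  # previous opponent move -> stored reply


def update_lgrf(tables: Dict[int, ReplyTable],
                playout: List[Tuple[int, Move]],
                winner: int) -> None:
    """Update per-player reply tables after one finished playout.

    `playout` is the sequence of (player, move) pairs in the order played.
    Replies made by the winner are stored; replies made by the loser are
    forgotten if they match what is currently stored.
    """
    for (_, prev_move), (player, reply) in zip(playout, playout[1:]):
        table = tables[player]
        if player == winner:
            table[prev_move] = reply   # remember the good reply
        elif table.get(prev_move) == reply:
            del table[prev_move]       # forget a reply that just lost


def suggest_reply(tables: Dict[int, ReplyTable],
                  player: int,
                  prev_move: Move,
                  legal_moves: Set[Move]) -> Optional[Move]:
    """Return the stored reply if it is legal, else None.

    On None the caller is assumed to fall back to its default policy.
    """
    reply = tables[player].get(prev_move)
    return reply if reply in legal_moves else None
```

In a playout, suggest_reply would be tried first and the usual default
policy used whenever it returns None. A "last-bad-reply" variant could
presumably keep a second table of losing replies to avoid, but that is not
shown here.
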
As for the Last-Bad-Reply idea, you can read about my experiences with
this and related policies in my Master's thesis, if you're interested.
It contains the idea that resulted in the "Power of Forgetting" paper as
well.
http://www.ke.tu-darmstadt.de/lehre/arbeiten/master/2010/Baier_Hendrik.pdf
regards,
Hendrik
> I admit that it's difficult for me to include such a deterministic default policy. :-)
> With a softmax policy, using the "last-LOST-reply" information may be a good direction.
>
> Aja