[Computer-go] Computer-go Digest, Vol 12, Issue 79

Wed Jan 26 01:00:47 PST 2011

Hi Aja,

I would be interested in your results. I think the LGRF policy is only a
small first step in the direction of more adaptive playouts (and,
hopefully, of overcoming the horizon effect).
As for the Last-Bad-Reply idea, you can read about my experiences with
this and related policies in my Master's thesis, if you're interested.
It also contains the idea that led to the "Power of Forgetting" paper.
http://www.ke.tu-darmstadt.de/lehre/arbeiten/master/2010/Baier_Hendrik.pdf

regards,
Hendrik

> I admit that it's difficult for me to include such a deterministic default policy. :-)
> With a softmax policy, using the "last-LOST-reply" information may be a good direction.
>
> Aja