[Computer-go] Exploration formulas for UCT
Yamato
yamato_cg at yahoo.co.jp
Sat Jan 1 21:09:02 PST 2011
Aja wrote:
>> We use the Silver formula:
>>
>> rave_visits / (rave_visits + real_visits + rave_visits * real_visits * 3000)
>>
>> The figure of 3000 is surprisingly resilient. Even with radically
>> different heuristics and playouts, it stays the empirical optimum.
>
> Interesting. According to Sylvain's original post here, that means you
> set bias to sqrt(3000/4)=27.386... But shouldn't bias be in the range
> [0,1]?
I guess it should be "/ 3000", not "* 3000".
Zen also uses this type of formula, but with a smaller constant.
I use 400 in the latest version of Zen.
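
A minimal sketch of the formula with the constant as a divisor, to show what the two values imply. The function names and the parameter k are hypothetical, chosen for illustration, and the bias relation assumes the coefficient on rave_visits * real_visits equals 4 * bias^2, as the sqrt(3000/4) in the quoted question suggests.

```python
import math


def rave_weight(rave_visits, real_visits, k):
    """Weight given to the RAVE estimate, with the constant k as a divisor.

    k = 3000 is the corrected figure from the quoted post; k = 400 is Zen's.
    """
    if rave_visits == 0:
        return 0.0  # no RAVE samples yet, so rely on real visits only
    return rave_visits / (
        rave_visits + real_visits + rave_visits * real_visits / k
    )


def implied_bias(k):
    """Bias b such that 4 * b**2 == 1 / k (assumed relation, see above)."""
    return math.sqrt(1.0 / (4.0 * k))


if __name__ == "__main__":
    for k in (3000, 400):
        print(k, implied_bias(k), rave_weight(100, 100, k))
```

Under that assumption the bias comes out near 0.009 for 3000 and 0.025 for 400, both inside [0,1]. The smaller constant makes the weight on the RAVE estimate fall off sooner as real visits accumulate.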
--
Yamato