[Computer-go] Exploration formulas for UCT

Yamato yamato_cg at yahoo.co.jp
Sat Jan 1 21:09:02 PST 2011


Aja wrote:
>>  We use the Silver formula:
>>
>> rave_visits / (rave_visits + real_visits + rave_visits * real_visits * 
>> 3000)
>>
>> The figure of 3000 is surprisingly resilient. Even with radically
>> different heuristics and playouts, it stays the empirical optimum.
>
>   Interesting. According to Sylvain's original post here, that means you 
>set bias to sqrt(3000/4)=27.386... But shouldn't bias be in the range 
>[0,1]?

I guess it should be "/ 3000", not "* 3000". That would give a bias of
sqrt(1/(4*3000)), about 0.009, which is within [0,1].

Zen also uses this type of formula, but the constant value is rather
small. I use 400 for the latest version of Zen.
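
To make the arithmetic concrete, here is a minimal Python sketch of the
formula with the constant applied as a divisor. It assumes Sylvain's form
is rave / (rave + real + 4 * bias^2 * rave * real), which is what Aja's
sqrt(3000/4) calculation implies. The function names and the parameter k
are made up for illustration and are not taken from any program.

```python
import math


def rave_beta(rave_visits, real_visits, k=3000.0):
    """Weight given to the RAVE estimate, with the constant k as a divisor.

    Assumed form: rave / (rave + real + rave * real / k).
    """
    if rave_visits == 0:
        # No RAVE samples yet, so the RAVE estimate gets no weight.
        return 0.0
    return rave_visits / (
        rave_visits + real_visits + rave_visits * real_visits / k
    )


def implied_bias(k):
    """Bias b such that 4 * b^2 == 1 / k (assumed relation to Sylvain's form)."""
    return math.sqrt(1.0 / (4.0 * k))


if __name__ == "__main__":
    for k in (3000.0, 400.0):
        print(k, implied_bias(k), rave_beta(100, 100, k))
```

Under these assumptions a smaller constant means a larger implied bias, so
the weight moves from the RAVE value to the real value after fewer
playouts.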

--
Yamato


