[Computer-go] Fwd: News on Tromp-Cook ?
Aja
ajahuang at gmail.com
Sat Jan 1 08:16:28 PST 2011
Hi Fuming,
Most of the current strong programs use UCT combined with RAVE (a kind of AMAF). The formula looks like this (there are many variants):
C*RAVE+(1-C)*UCT
C is the weight of RAVE. As far as I know, there are at least two useful formulas for computing C:
1. The first formula was proposed in the famous paper "Combining Online and Offline Knowledge in UCT" by Sylvain Gelly and David Silver (please refer to Section 6): http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf
2. The second (newer) formula was posted on this list in the past. The formal reference will be David Silver's PhD thesis. According to my testing, this new formula is 70 Elo stronger than the first one.
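
As a rough illustration of the mix above, here is a minimal Python sketch. The function names are made up for this example. beta_gelly_silver follows the schedule sqrt(k / (3n + k)) from the ICML 2007 paper, with k = 1000 used only as an illustrative default. beta_min_mse is the minimum-MSE schedule usually attributed to David Silver's later work; that it is the "second formula" meant above is an assumption, and the bias value is a placeholder tuning constant.

```python
import math


def beta_gelly_silver(n, k=1000.0):
    """RAVE weight C from the ICML 2007 schedule: sqrt(k / (3n + k)).

    n is the node's visit count; k is the visit count at which the
    weight drops to 1/2 (the default here is illustrative only).
    """
    return math.sqrt(k / (3.0 * n + k))


def beta_min_mse(n, n_rave, bias=0.1):
    """Assumed form of the newer schedule: n_rave / (n + n_rave + 4*n*n_rave*bias^2).

    n is the visit count, n_rave the RAVE (AMAF) count, and bias a
    hypothetical tuning constant for the RAVE bias.
    """
    denom = n + n_rave + 4.0 * n * n_rave * bias * bias
    if denom == 0:
        return 1.0  # no statistics at all; arbitrary choice for this sketch
    return n_rave / denom


def mixed_value(uct_value, rave_value, c):
    """C*RAVE + (1-C)*UCT, as written in the formula above."""
    return c * rave_value + (1.0 - c) * uct_value


if __name__ == "__main__":
    for n in (0, 100, 1000, 10000):
        c = beta_gelly_silver(n)
        print(n, round(c, 3), round(mixed_value(0.45, 0.60, c), 3))
```

With either schedule, C starts near 1 when a node has few visits, so the RAVE estimate dominates early, and it shrinks toward 0 as the visit count grows.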
RAVE is really a big invention. It is a major contribution of MoGo. We must thank Sylvain and David for bringing such a powerful method to us. :)
Aja
----- Original Message -----
From: Fuming Wang
To: computer-go at dvandva.org
Sent: Saturday, January 01, 2011 10:16 PM
Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?
So the current strong programs are closer to AMAF than to UCT, right?
Fuming
On Sat, Jan 1, 2011 at 11:32 AM, David Fotland <fotland at smart-games.com> wrote:
I still have a UCB term, but that is probably because I depend more on Many
Face's move generator. I have a RAVE term, but its contribution is small.
It seems that if the RAVE term is large, then RAVE creates enough
exploration by itself.
David
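
To make the role of the UCB term concrete, here is a small Python sketch of child selection with an adjustable exploration coefficient. It is not taken from Many Faces or any other program; the names ucb_term and select, the tuple layout, and the numbers are assumptions for illustration. The value field stands in for whatever estimate a program uses (for example, a RAVE-mixed value).

```python
import math


def ucb_term(total_visits, child_visits, c):
    """Standard UCB1-style bonus: c * sqrt(ln(total) / child_visits)."""
    if c == 0.0:
        return 0.0  # no exploration bonus at all
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    return c * math.sqrt(math.log(total_visits) / child_visits)


def select(children, c):
    """Pick the child name with the highest value + exploration bonus.

    children is a list of (name, value, visits) tuples (hypothetical layout).
    """
    total = sum(visits for _, _, visits in children)
    best = max(children, key=lambda ch: ch[1] + ucb_term(total, ch[2], c))
    return best[0]


if __name__ == "__main__":
    children = [("a", 0.55, 100), ("b", 0.50, 5)]
    print(select(children, 0.0))  # greedy on value alone
    print(select(children, 1.0))  # bonus favours the rarely visited child
```

With the coefficient at 0, selection is purely greedy on the value estimate, so any exploration has to come from elsewhere, such as the RAVE term or randomness in the playouts.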
> -----Original Message-----
> From: computer-go-bounces at dvandva.org [mailto:computer-go-
> bounces at dvandva.org] On Behalf Of Petr Baudis
> Sent: Friday, December 31, 2010 6:27 PM
> To: computer-go at dvandva.org
> Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?
>
> Hi!
>
> On Fri, Dec 31, 2010 at 07:02:35PM +0800, Fuming Wang wrote:
> > Now I know Remi is the first to utilize MCTS. Guess I need to read papers
> > more carefully. I do have a question though. I thought UCT is the foundation
> > of the current strong programs. I know that a RAVE term is added to the
> > original UCB term, i.e. sqrt(ln(t_total)/t_i), but the UCB term is still there,
> > right? Could you elaborate a bit on why you say "UCT is not good for Go"?
> > This is quite contradictory to a lot of material on the internet regarding
> > the latest breed of Go programs.
>
> Most likely not all (e.g. it seems not ManyFaces?), but at least many
> programs use exploration coefficients that are either zero or negligibly
> small.
>
> In Pachi, I ended up using 0 as the exploration coefficient; it
> seems to work best. But this probably also depends on the fact that
> I have slight forced randomization of playouts. 0.02 can work well on
> 9x9 too, but it also depends on the priors, etc.
>
> Overall, it is a question of how the program as a whole is tuned. But
> right now, reasonably strong play with only RAVE and no UCB1 is
> certainly possible.
>
> --
> Petr "Pasky" Baudis
> Computer science education cannot make an expert programmer any more
> than studying brushes and pigment can make an expert painter. --esr
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go