[Computer-go] replacing dynamic komi with a scoring function

Mon Jan 9 08:07:31 PST 2012

A very insightful post. I enjoyed reading it and I think it makes some
sense. It's clear that a lot of effort is wasted on playouts when 99% of
them end with the same result.
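
A rough way to put numbers on this, as a sketch only: treat each playout
as an independent win/loss trial and measure how much one result can tell
us using Shannon entropy. The normal model of the final margin, the
numbers 35 and 15, and the function names are all assumptions made for
illustration. They are not taken from any real engine.

```python
import math
from statistics import NormalDist


def playout_entropy_bits(p_win: float) -> float:
    """Shannon entropy, in bits, of one win/loss playout result."""
    if p_win <= 0.0 or p_win >= 1.0:
        return 0.0  # a certain outcome carries no information
    p_loss = 1.0 - p_win
    return -(p_win * math.log2(p_win) + p_loss * math.log2(p_loss))


def win_probability(mean_margin: float, sd_margin: float,
                    komi_shift: float = 0.0) -> float:
    """P(final margin > komi_shift), ASSUMING a normal (bell-shaped) margin."""
    return 1.0 - NormalDist(mean_margin, sd_margin).cdf(komi_shift)


# Hypothetical handicap position: the leader is ahead by 35 points on
# average, with a standard deviation of 15 points across playouts.
MEAN_MARGIN, SD_MARGIN = 35.0, 15.0

p_plain = win_probability(MEAN_MARGIN, SD_MARGIN)
p_shifted = win_probability(MEAN_MARGIN, SD_MARGIN, komi_shift=MEAN_MARGIN)

print(f"no shift:   win rate {p_plain:.3f}, "
      f"{playout_entropy_bits(p_plain):.3f} bits per playout")
print(f"with shift: win rate {p_shifted:.3f}, "
      f"{playout_entropy_bits(p_shifted):.3f} bits per playout")
```

Under these assumptions the unshifted position wins about 99% of the
playouts, so each result carries well under a tenth of a bit, while
moving the threshold to the mean of the bell restores a 50% win rate and
a full bit per playout. This only illustrates the sampling argument. It
says nothing about whether the shifted objective picks better moves.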

Don

On Mon, Jan 9, 2012 at 10:26 AM, Vlad Dumitrescu <vladdu55 at gmail.com> wrote:

> Hi,
>
> On Mon, Jan 9, 2012 at 13:17, Don Dailey <dailey.don at gmail.com> wrote:
> > Summary:
> >
> > I believe a more correct scoring function won't be based on how much you
> > win by OR how often you win but will incorporate some other more relevant
> > concept, and it will be dynamic. And it will not matter if the game is
> > a handicap game or otherwise because the scoring function will always be
> > relevant. The goal will be to maximize your winning chances, but it
> > will incorporate something more sophisticated than just counting how
> > often you win or how much you win by.
>
> I hope I may chime in with something that Don's nice description
> revealed to me. It feels rather obvious, but since nobody has stated it
> explicitly, maybe it's news for at least some people here.
>
> MCTS is maximizing the chances of winning. These chances are largest
> for a minimal score difference because this allows for making some
> errors. Winning by the largest possible score has rather small chances
> to happen because every move has to be perfect.
>
> The curve describing the probability of ending the game with a certain
> score is bell-shaped and MCTS explores the area beneath it, looking
> for winning moves. With handicap, the disadvantaged side gets
> fewer samples explored, making it less likely to discover the really
> good moves. Dynamic komi shifts the bell left or right in order to
> equalize the sampling on both sides, but as mentioned it isn't dynamic
> enough (the curve changes after each move) and it also uses
> a different shape for the curve than the real "handicap curve".
>
> In theory, I think that the solution for keeping the same level of
> play with handicap as without would be to make sure that the
> disadvantaged side gets just as many samples with or without handicap.
> That is, use more playouts when playing with handicap. In practice,
> this is probably prohibitive...
>
> I wonder if it might be possible to estimate the shape of this curve
> after each move and use that estimate to dynamically adjust the number
> of playouts. One might have to use higher precision calculations, too,
> so that the noise doesn't get too loud.
>
> Does this make any sense? Has anyone tried something like this?
>
> best regards,
> Vlad