[Computer-go] Are 4 'easy to avoid errors' common to all MC programs?

Mon Jan 24 02:27:18 PST 2011

  Hi!

On Mon, Jan 24, 2011 at 01:00:30PM +0900, Darren Cook wrote:
> Without dynamic komi the program will choose the move to maximize its
> winning rate. If dynamic komi causes a different move to be chosen then
> it implies it is choosing a move that it thinks has a less than or equal
> chance of winning.

  Ok, but the point of dynamic komi is to compensate for situations where
what it thinks is not aligned with reality, e.g. when the winrate is so
high or so low that the differences between moves drown in noise. If you
win all your games, how do you decide which move is more likely to win them?
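
  To make the saturation argument concrete, here is a toy sketch. The
playout margins are made up for illustration and `winrate` is a
hypothetical helper, not code from any real engine. With no extra komi
both moves win every playout, so the winrate cannot separate them;
shifting the komi restores the signal.

```python
def winrate(margins, extra_komi=0.0):
    """Fraction of playouts still won after giving extra_komi points away."""
    return sum(m - extra_komi > 0 for m in margins) / len(margins)

# Made-up final margins (in points) of playouts after two candidate moves.
solid_move = [12, 9, 15, 11, 8, 14]
slack_move = [3, 1, 6, 2, 4, 1]

# Without extra komi, both moves look equally good (every playout is a win).
flat = (winrate(solid_move), winrate(slack_move))

# With 5 points of artificial disadvantage, the solid move stands out.
shifted = (winrate(solid_move, 5.0), winrate(slack_move, 5.0))
```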

> Also, I believe Don did some (self-play?) experiments a few years ago
> and the dynamic komi version lost more; I think this is where Don's
> coolness to dynamic komi comes from (apologies if my memory is inaccurate).
> 
> However, Magnus has some experiments with Valkyria where the dynamic
> komi version was stronger even in non-handicap self-play games.

  Well, _my_ evidence is http://pasky.or.cz/~pasky/go/dynkomi.pdf as I
already described some time ago on the mailing list.

> >> I.e. program endgame is generally stronger than the humans of the same
> >> rank; chances are a 1-dan human will make a few 1pt or 2pt errors during
> >> the endgame.
> > 
> > I also think this is not obviously true at all. My observations have
> > been that MCTS does not perform too well at all in very close endgames.
> > (Though it is not a big disadvantage in practice since it is in the
> > nature of MCTS to strive for deciding the game ASAP, i.e. in the middle
> > game.)
> 
> My study has mostly been of 9x9 games, and as long as there is not a
> seki on the board the MCTS programs will practically never lose if they
> have a winning position at move 30. (Extra condition for that statement:
> Chinese rules.)

  Ah, I see. I agree that on 9x9, endgame is easy (for MCTS).

On Mon, Jan 24, 2011 at 04:31:51AM -0500, Don Dailey wrote:
> My coolness is based on experiments with thousands of games, or even tens of
> thousands of games.   The only way you can reliably measure a small
> improvement is to play many thousands of games.   If the improvement is
> large,   you might get by with a few hundred.     If your program is well
> developed no single improvement is going to give you more than a few ELO
> points of improvement so you are pretty much required to play several
> thousand games and I doubt many here are doing that.

  My experience is that if I'm careful to weed out all the bugs, the vast
majority of the improvements I test and that work do amount to 10 Elo
or more. (Of course, most of the "improvements" I test do not work at
all. ;-)
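
  For a rough feel of how many games a given Elo gain needs, a
back-of-the-envelope sketch follows. It assumes the usual logistic Elo
model, no draws, independent games, and a two-sigma (z = 1.96) criterion
against a 50% baseline; the function names are mine, and this is not how
anyone in this thread necessarily ran their tests.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, z=1.96):
    """Games until the expected winrate sits z standard errors above 50%.

    Uses the binomial standard error sqrt(p * (1 - p) / n) and solves
    (p - 0.5) = z * sqrt(p * (1 - p) / n) for n.
    """
    p = expected_score(elo_diff)
    return math.ceil(z * z * p * (1.0 - p) / (p - 0.5) ** 2)

# Compare a small, a moderate and a large improvement.
estimates = {d: games_needed(d) for d in (3, 10, 50)}
```

  Under these assumptions a gain of a few Elo needs tens of thousands of
games, 10 Elo needs a few thousand, and 50 Elo a few hundred.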

> Nevertheless,   that does not mean I stumbled on the
> right implementation and proved it does not work.   It's quite possible that
> I just never found a good way to do it.
> 
> 
> > However, Magnus has some experiments with Valkyria where the dynamic
> > komi version was stronger even in non-handicap self-play games.
> >
> > >> I.e. program endgame is generally stronger than the humans of the same
> > >> rank; chances are a 1-dan human will make a few 1pt or 2pt errors during
> > >> the endgame.
> > >
> > > I also think this is not obviously true at all. My observations have
> > > been that MCTS does not perform too well at all in very close endgames.
> > > (Though it is not a big disadvantage in practice since it is in the
> > > nature of MCTS to strive for deciding the game ASAP, i.e. in the middle
> > > game.)
> >
> > My study has mostly been of 9x9 games, and as long as there is not a
> > seki on the board the MCTS programs will practically never lose if they
> > have a winning position at move 30. (Extra condition for that statement:
> > Chinese rules.)
> >
> 
> My experiments were on 9x9 too.

  I have my doubts about its effectiveness on 9x9; I have done all my tests
on 19x19 and I think I never got any significant effect on 9x9, but I did
not try very hard there.

> I believe what was happening with my implementations of this is that it
> worked well most of the time,  but not when it really mattered.    When it
> didn't work,  it was turning a simple win into a struggle and sometimes a
> loss.

  I have found it important to use a ratchet that stops dynamic komi
from adding more artificial disadvantage once the program loses
its advantage, and to never expire this ratchet for the rest of the game.
So dynamic komi will compensate for an initial advantage, but never kick
in again if the game gets complicated.
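
  A minimal sketch of such a ratchet, to illustrate the idea only. The
class name, the thresholds and the one-point step are all assumptions
made up for this example, and it models only the case where the program
handicaps itself; it is not a copy of any engine's actual code.

```python
from dataclasses import dataclass

@dataclass
class RatchetedKomi:
    """Hypothetical dynamic-komi controller with a never-expiring ratchet."""
    step: float = 1.0              # komi change per adjustment (assumed)
    raise_above: float = 0.60      # winrate above which we handicap ourselves more
    lower_below: float = 0.45      # winrate below which we consider the lead lost
    extra_komi: float = 0.0        # points currently given to the opponent
    ratchet: float = float("inf")  # cap on extra_komi; it only ever goes down

    def update(self, winrate: float) -> float:
        if winrate < self.lower_below:
            # Advantage lost: back off and lock the new level in as a cap.
            self.extra_komi = max(0.0, self.extra_komi - self.step)
            self.ratchet = min(self.ratchet, self.extra_komi)
        elif winrate > self.raise_above:
            # Comfortable lead: add disadvantage, but never past the ratchet.
            self.extra_komi = min(self.extra_komi + self.step, self.ratchet)
        return self.extra_komi

# The lead grows, is lost once, then recovers; komi stays capped afterwards.
komi = RatchetedKomi()
history = [komi.update(w) for w in (0.7, 0.7, 0.7, 0.4, 0.8, 0.8)]
```

  Because the cap is never reset, a later recovery in winrate cannot
push the extra komi above the level at which the lead was lost.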

-- 
				Petr "Pasky" Baudis
Computer science education cannot make an expert programmer any more
than studying brushes and pigment can make an expert painter. --esr