[Computer-go] Are 4 'easy to avoid errors' common to all MC programs?

Mon Jan 24 06:51:49 PST 2011

On Mon, Jan 24, 2011 at 4:56 AM, Stefan Kaitschick <
Stefan.Kaitschick at hamburg.de> wrote:

>
>  My experiments were on 9x9 too.
>>
>> I believe what was happening with my implementations of this is that it
>> worked well most of the time,  but not when it really mattered.    When it
>> didn't work,  it was turning a simple win into a struggle and sometimes a
>> loss.
>>
>> Don
>>
>
> Why would a simple win be lost? Within competitive winrate bounds, there
> shouldn't even be dynamic komi.
> Only positions where the program thinks that one side should almost resign
> should be affected.
>

There are two sides to this,   when you are losing with almost certainty and
when you are winning with near certainty.    If you only address one of
these cases you don't really solve the undesirable behavior.

> Does that mean that positions were lost because the program burdened itself
> with a negative komi in a high winrate position and played needlessly
> adventurous moves?
>

On the winning side, say the program is 90% happy and you raise the
requirement by one point in the appropriate direction. In most positions,
as I reported, this can make its play more cosmetically appealing. But
it might also make the probability of winning go from 90% to 75%, and this
means the program has to work around some wins because you have artificially
defined them as not enough. I discovered there are a lot of positions
where winning by 1 point was easy but winning by 2 points was nearly
impossible. In one of my experiments the program adjusted even for this
by reverting to the previous komi if it was pushed too hard. In that
case, the only problem is that you lost some time.
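
A minimal sketch of the "raise the requirement, revert if pushed too hard"
behaviour described above. Everything here is illustrative rather than taken
from any actual program: estimate_win_rate is a hypothetical callback that
runs playouts at a given komi and returns the fraction won, and the
thresholds (0.90, 0.80) and the one-point step are assumed values.

```python
def choose_komi(base_komi, estimate_win_rate,
                raise_above=0.90, floor=0.80, step=1.0):
    """Pick the komi to search with, seen from the side to move.

    A larger komi means the side to move must win by more points.
    estimate_win_rate(komi) is a hypothetical callback (an assumption of
    this sketch) returning the playout win rate at that komi.
    """
    base_rate = estimate_win_rate(base_komi)
    if base_rate < raise_above:
        # Not winning comfortably: leave the komi alone.
        return base_komi

    harder = base_komi + step  # demand one more point
    if estimate_win_rate(harder) < floor:
        # Pushed too hard (e.g. 90% drops to 75%): revert.
        # The only cost is the time spent on the extra search.
        return base_komi
    return harder
```

Note that the second call to estimate_win_rate is exactly the extra search
time that has to be paid for out of the clock.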

This makes me wonder whether time was ignored as a factor in the
experiments of others. Any algorithm that is based on dynamically detecting
and re-searching is not free, certainly not in a game with time controls.

On the losing side there is the additional argument that it might pay to try
to continue to play reasonable moves while waiting for a mistake by the
opponent or discovering a hidden opportunity.     I'm not sure how good that
argument is,  but it certainly sounds appealing.

The entire idea certainly sounds appealing in both directions and in fact I
think it DOES improve the play not only cosmetically but in practical ways,
 if you pick and choose specific examples and games.     But if you admit
this,  you also cannot avoid the fact that it has a downside.   The only
relevant question is whether it helps in the long run.   How many games does
it lose?   This must be tallied into the total.

> If that is the case, dyn. komi should not be used as a fix for the "win by
> 0.5" syndrome, but only in the opening and middle game.
>

That is unsatisfying.   My whole reason for trying this idea was
to cosmetically improve the play without affecting the actual strength of
the program.    I would have been overjoyed to find a real improvement,  but
I was willing to accept a very small Elo loss in order to get more
cosmetically appealing play.

> To fix the 0.5 mess, maybe the score could be slightly incorporated into
> the evaluation function when positions are stable. (When the playouts come
> back with few total win/total loss results)
>

In my view,  that is the right idea.   If the winning percentage (whether up
or down) is virtually the same for several moves, and changing the komi in a
more opportunistic direction does not affect this,  then it's surely a win.
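
One way to read that test, as a rough sketch only: the win rate has to be
flat over the last few moves, and it has to stay put when the komi is
shifted. The function name, the window of 3 moves and the 0.02 tolerance
are assumptions made for illustration, not values from any experiment.

```python
def komi_shift_is_safe(recent_rates, rate_with_shift,
                       tolerance=0.02, window=3):
    """Return True if the position looks stable enough to shift the komi.

    recent_rates:    win rates reported after each of the last moves
    rate_with_shift: win rate measured with the shifted komi
    tolerance and window are assumed tuning knobs.
    """
    if len(recent_rates) < window:
        return False  # not enough history to call the position stable
    last = recent_rates[-window:]
    stable = max(last) - min(last) <= tolerance
    unaffected = abs(rate_with_shift - last[-1]) <= tolerance
    return stable and unaffected
```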

One idea I had and never tested in any kind of thorough way is to
incorporate the actual final score into the least significant bits of the
win/loss tally.    For example in 9x9 one could consider the game a zero sum
game where 10000 points are at stake and the most you can get is 10000
points for a win by 81 points.     So if the loser hangs on to a little bit
of territory he would still get a small amount of credit for a fraction of a
win.   However,  I believe the credit should be quite small for territory.
 The little bits should be small enough that it would take a lot of them to
equal a game,  but this could be tuned to whatever works best.
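
To make the arithmetic concrete, here is a small sketch of such a reward,
under assumptions of my own choosing: the winner always gets at least
10000 minus a "territory pool", the pool is shared out in proportion to the
margin, and the loser gets whatever is left so the game stays zero sum. The
pool size of 1000 is arbitrary and is the knob that would need tuning; the
function name is hypothetical.

```python
TOTAL = 10_000     # points at stake in one 9x9 game
MAX_MARGIN = 81    # largest possible margin on a 9x9 board

def playout_reward(margin, territory_pool=1_000):
    """Reward for one playout, given the final margin (positive = win).

    The win/loss result dominates; the margin only moves the reward
    inside the small territory_pool (an assumed, tunable value).
    """
    if margin == 0:
        return TOTAL / 2  # drawn game: split the stake
    if margin < 0:
        # Zero sum: the loser gets what the winner did not.
        return TOTAL - playout_reward(-margin, territory_pool)
    m = min(margin, MAX_MARGIN)
    return (TOTAL - territory_pool) + territory_pool * m / MAX_MARGIN
```

With these numbers a win by 81 points is worth the full 10000, the narrowest
win is still worth more than 9000, and a loser who hangs on to some
territory gets a little more credit than one who is wiped out.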

Has this idea ever been fleshed out well?

>
> Stefan
>
>
>
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>