[Computer-go] replacing dynamic komi with a scoring function

Mon Jan 9 04:17:01 PST 2012

On Sun, Jan 8, 2012 at 7:22 PM, Stefan Kaitschick <
stefan.kaitschick at hamburg.de> wrote:

> scoring function:
>
> The most successful scoring function so far is the win/lose function.
> Sigmoid functions and other schemes have been tried, but none have
> surpassed or even equaled the simple step function.
>
> dynamic komi:
>
> dynamic komi is widely used by bots in handicap games.
> An initial artificial komi burden is placed on black which is
> incrementally reduced to zero during the game.
> This gives the bot a "more realistic" goal as white, and a motivation not
> to play slackly as black.
>
>
> One of the traps of dynamic komi is that the bot will be willing to
> simplify as white, if only he is catching up quickly enough, as specified
> by the dynamic komi.
>

A problem with any dynamic komi scheme is that it treats all scenarios
as if they were the same. A game could end with a 1 stone win because that
was the simplest way to win despite the possibility of winning by more, or
it might actually have been a very close game.

So any solution based on komi manipulation outside the search seems like a
kludge or hack. It's also a static algorithm by the definition I am
comfortable with: it is preprocessed and then uniformly applied to the
search without change - only between searches does it change. Probably
just semantics, but I mention it because I think this is its biggest
flaw.
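
To make the "static within a search" point concrete, here is a minimal
sketch of a dynamic komi scheme of the kind described in the quoted text.
The linear schedule, the function names and the default values
(initial_komi, ramp_moves) are assumptions chosen for illustration, not
taken from any particular bot.

```python
def scheduled_komi(move_number, initial_komi, ramp_moves):
    """Artificial komi burden: starts at initial_komi, falls linearly to 0.

    The linear shape is an assumption; real bots may use other schedules.
    """
    if move_number >= ramp_moves:
        return 0.0
    return initial_komi * (1.0 - move_number / ramp_moves)


def win_loss_score(margin, komi):
    """Step function: 1.0 if the playout margin beats the komi, else 0.0."""
    return 1.0 if margin > komi else 0.0


def evaluate_search(playout_margins, move_number,
                    initial_komi=30.0, ramp_moves=100):
    """Mean win/lose score over all playouts of one search.

    The komi is computed once, before the search, and applied unchanged to
    every playout. It only changes between searches (as move_number grows).
    """
    komi = scheduled_komi(move_number, initial_komi, ramp_moves)
    scores = [win_loss_score(m, komi) for m in playout_margins]
    return sum(scores) / len(scores)
```

Every playout in a given search is judged against the same komi, whatever
actually happened inside that playout.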

When you have a handicap situation, the problem is that the computer
already knows who is winning. In a game theoretic sense the
person receiving the handicap stones has a won game. So just saying
that the so-called "dynamic komi" is bad is no solution. Even an ad-hoc
solution is better than none if it constitutes an improvement.

I doubt I have much to lend to this discussion except to say that we should
seek a solution that is dynamic. Our scoring function should be one
which improves our chances to win even if we are dead lost, and it should be
dynamic, not defined by considerations outside the search that may or may
not have any relevance in any given situation. We have already proved
that the goal to "win by as much as possible" is a step backwards;
otherwise we would be using the simple area tally as our scoring function.
And yet we still try to impose this concept, probably because giving it up
goes against our intuition. We are used to being more impressed by big wins
in real life.

I would like to suggest that we focus on the concept of finding a dynamic
scoring function that is more in sync with the goal of maximizing your
winning chances. Not giving up when losing clearly maximizes your winning
chances, but it is obviously incorrect to equate that with "losing by as
little as possible", which still chooses a goal of losing the game. If
I were a strong go player I think I could propose something more specific,
but I'm sorry that I cannot. A naive suggestion: a scoring function
could be developed based on more than just the final result of the playout,
also incorporating a static evaluation feature or something happening
inside the playouts, such as some measure of volatility. The way the
playout developed could be relevant.
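
One purely illustrative reading of the naive suggestion above, not a tested
proposal: take the usual win/lose result and pull it toward 0.5 when the
playout was volatile. It assumes each playout can report a trace of running
margin estimates; that trace, the volatility measure and the weight
parameter are all hypothetical.

```python
import statistics


def volatility(margin_trace):
    """Population std dev of successive changes in the running margin."""
    deltas = [b - a for a, b in zip(margin_trace, margin_trace[1:])]
    return statistics.pstdev(deltas) if deltas else 0.0


def playout_score(margin_trace, komi=0.0, weight=0.1):
    """Win/lose result, shrunk toward 0.5 as the playout gets more volatile.

    margin_trace: hypothetical running margin estimates during the playout;
    the last entry is the final margin. weight is an arbitrary tuning knob.
    """
    won = 1.0 if margin_trace[-1] > komi else 0.0
    confidence = 1.0 / (1.0 + weight * volatility(margin_trace))  # in (0, 1]
    return 0.5 + (won - 0.5) * confidence
```

A steady win such as playout_score([0, 1, 2, 3]) keeps its full value,
while a win reached through large swings counts for somewhat less. Whether
this helps playing strength is an open question.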

Summary:

I believe a more correct scoring function won't be based on how much you
win by OR how often you win, but will incorporate some other more relevant
concept, and it will be dynamic. And it will not matter if the game is
a handicap game or otherwise, because the scoring function will always be
relevant. The goal will be to maximize your winning chances, but it
will incorporate something more sophisticated than just counting how often
you win or how much you win by.

Don

> The bot can then find itself in a played out position, where catching up
> the final margin proves impossible.
> Conceptually, it would be nice to search at several different komi levels,
> from the komi needed for a 50% winrate (honest play), to the correct komi.
> If these values are added up (possibly weighted in some way), the result
> should be the most honest move that still has long term perspective.
> Doing this would have an obvious drawback though: fewer playouts per komi
> level, probably resulting in an overall weaker game.
> So here is my idea: introduce a scoring function that has the 0-border at
> the komi needed for a 50% winrate, and the 1-border at +1 (as in the step
> function)
> (this would be if the bot is trailing, otherwise from 0 to komi)
> I have no idea what would be the proper function between those points,
> maybe a half sigmoid function, possibly just a straight line.
>
> The idea is not to maximize the score, but to capture the results of
> different komi levels in a single search, so as not to lose playouts.
>
>
> Stefan
>
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
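
The scoring function proposed in the quoted text can be sketched as
follows, using the straight-line option. This is one reading of the
proposal under stated assumptions: margins are measured from the bot's
point of view after the real komi, so an actual win is margin > 0, and k50
(a hypothetical name) is the margin at which the bot currently wins 50% of
its playouts. The ramp runs between k50 and 0 in whichever order applies.

```python
def ramp_score(margin, k50):
    """Linear ramp between the 50%-winrate margin and the real win border.

    Trailing bot (k50 < 0): 0 at k50, rising to 1 at an actual win.
    Leading bot  (k50 > 0): 0 at the real border, rising to 1 at k50.
    If k50 == 0 this degenerates to the ordinary win/lose step function.
    """
    lo, hi = min(k50, 0.0), max(k50, 0.0)
    if hi == lo:
        return 1.0 if margin > 0 else 0.0
    if margin <= lo:
        return 0.0
    if margin >= hi:
        return 1.0
    return (margin - lo) / (hi - lo)
```

For a bot trailing by 20, ramp_score(-10, -20) gives 0.5: half credit for
a playout that recovers half of the deficit. A half sigmoid could be
substituted for the straight line between the two borders.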