AI Devspace

Why did AlphaGo lose its Go game?

asked 1 week ago · 1 answer · 2.1K views

We can read on the Wikipedia page that in March 2016 AlphaGo lost one of its five games to Lee Sedol, a professional Go player. One article quotes a researcher:

AlphaGo lost a game and we as researchers want to explore that and find out what went wrong. We need to figure out what its weaknesses are and try to improve it.

Have researchers already figured out what went wrong?

1 Answer

We know what Lee's strategy was during the game, and it seems like the sort of thing that should work. Here's an article explaining it. Short version: yes, we know what went wrong, but probably not how to fix it yet.

Basically, AlphaGo is good at making lots of small decisions well, and at managing risk and uncertainty better than humans can. One of the surprising things about it, relative to previous Go bots, is how good it is at tactical fights: in previous games, Lee had built positions that AlphaGo needed to attack, and AlphaGo successfully attacked them.

So in this game, Lee reversed the strategy. Instead of trying to win many different influence battles, where AlphaGo had already shown it was stronger than him, he set up one critical battle (incurring minor losses along the way) and aimed to defeat AlphaGo there, with ripple effects that would settle the match in his favor.

So what weakness of AlphaGo allowed that to work? As I understand it, it's a fundamental limitation of Monte Carlo Tree Search (MCTS). MCTS works by randomly sampling continuations of the game and averaging the results: if 70% of sampled games from one candidate move go well and only 30% from another, then you should probably play the first move instead of the second.
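The averaging idea can be sketched in a few lines of Python. This is a toy illustration of my own, not AlphaGo's code; the win probabilities are made-up stand-ins for the results of real playouts:

```python
import random

def playout_win_rate(true_win_prob, n_samples=10_000, seed=0):
    """Estimate a move's value by averaging random playouts.
    `true_win_prob` is a made-up stand-in for the fraction of random
    continuations from that move that end in a win."""
    rng = random.Random(seed)
    wins = sum(rng.random() < true_win_prob for _ in range(n_samples))
    return wins / n_samples

def choose_move(candidates):
    """Pick the candidate move with the highest sampled win rate."""
    return max(candidates, key=lambda move: playout_win_rate(candidates[move]))

# Move A: ~70% of sampled games go well; move B: ~30%.
print(choose_move({"A": 0.70, "B": 0.30}))  # prints A
```

Averaging like this works well when a move's value is spread over many continuations, which is exactly the regime the next paragraph steps outside of.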

But when success depends on one specific sequence of plays (say, White has a winning path that requires playing exactly the right stone at every step, and Black has no good response to it), MCTS breaks down. A path that narrow can only be found through minimax reasoning, and moving from slower minimax reasoning to faster MCTS sampling is one of the big reasons bots are better now than they were in the past.
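A toy model makes the failure concrete. In the sketch below (my own illustration, assuming a tree where only White's choices matter and Black's replies are forced), exhaustive search proves the forced win, while uniformly random playouts, standing in crudely for MCTS averaging, rate the position as nearly lost because almost every sampled line misses the one correct stone:

```python
import random

def make_narrow_tree(depth, branching=3):
    """A game tree where White has exactly one winning move at each step.
    Internal nodes are lists of children; leaves are +1 (White wins) or -1.
    Black's replies are treated as forced, so only White's choices appear."""
    if depth == 0:
        return +1  # the end of the single precise winning path
    children = [make_narrow_tree(depth - 1, branching)]
    children += [-1] * (branching - 1)  # every other stone loses outright
    return children

def exact_best(node):
    """Exhaustive minimax-style search (only max nodes in this toy model)."""
    if not isinstance(node, list):
        return node
    return max(exact_best(child) for child in node)

def random_playout(node, rng):
    """Follow uniformly random moves to a leaf, as a crude playout."""
    while isinstance(node, list):
        node = rng.choice(node)
    return node

tree = make_narrow_tree(depth=4)
rng = random.Random(0)
avg = sum(random_playout(tree, rng) for _ in range(10_000)) / 10_000

print(exact_best(tree))  # prints 1: exhaustive search finds the forced win
print(avg < -0.9)        # prints True: sampling rates the position as lost
```

Only about one playout in 81 follows the winning path here, so the average is close to -1 even though the position is, in fact, won.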

It's unclear how to get around this. There may be a way to notice this sort of threat and temporarily switch from MCTS reasoning to minimax reasoning, or to keep particular dangerous trajectories in memory for consideration on future moves.
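One way to picture that hybrid idea, purely as a sketch of my own (the trigger condition is an assumption for illustration, not anything AlphaGo actually does): sample playouts as usual, but if a rare playout stumbles onto a win while the average looks bad, treat that as a sign of a narrow tactical line and re-evaluate with exhaustive search. Using the same toy narrow-path tree as above:

```python
import random

def make_narrow_tree(depth, branching=3):
    """Toy tree: exactly one winning move per step; every other move loses."""
    if depth == 0:
        return +1
    return [make_narrow_tree(depth - 1, branching)] + [-1] * (branching - 1)

def exact_best(node):
    """Exhaustive search over the toy tree (max nodes only)."""
    if not isinstance(node, list):
        return node
    return max(exact_best(child) for child in node)

def random_playout(node, rng):
    """Follow uniformly random moves to a leaf."""
    while isinstance(node, list):
        node = rng.choice(node)
    return node

def hybrid_value(node, rng, n_samples=1_000):
    """Average random playouts, but fall back to exhaustive search when a
    rare winning playout contradicts a bad average (an assumed trigger)."""
    samples = [random_playout(node, rng) for _ in range(n_samples)]
    avg = sum(samples) / n_samples
    if avg < 0 and max(samples) == 1:
        return exact_best(node)  # suspected narrow line: search it exactly
    return avg

tree = make_narrow_tree(depth=3)
print(hybrid_value(tree, random.Random(0)))  # prints 1
```

The catch, of course, is that real Go trees are far too large for the exhaustive fallback, which is why this remains an open problem rather than a fix.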


