Teaching Deep Convolutional Neural Networks to Play Go by Christopher Clark, Amos Storkey.
Abstract:
Mastering the game of Go has remained a long standing challenge to the field of AI. Modern computer Go systems rely on processing millions of possible future positions to play well, but intuitively a stronger and more ‘humanlike’ way to play the game would be to rely on pattern recognition abilities rather than brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to ‘hard code’ symmetries that are expected to exist in the target function, and demonstrate in an ablation study that they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction programs have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go, indicating they are state of the art among programs that do not use Monte Carlo Tree Search. They are also able to win some games against the state of the art Go playing program Fuego while using a fraction of the play time. This success at playing Go indicates high level principles of the game were learned.
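One technical detail worth pausing on is the weight tying used to ‘hard code’ symmetries. A Go position plays the same under the eight symmetries of the square (rotations and reflections), so a move predictor should respond identically to a board and any of its rotations or reflections. The abstract doesn't spell out the paper's exact tying scheme, but the general idea can be illustrated by constraining a convolutional filter to be invariant over those transforms. Here is a minimal NumPy sketch (the helper names are my own, not the paper's):

```python
import numpy as np

def d4_transforms(kernel):
    """Yield the eight dihedral (D4) transforms of a square 2-D kernel:
    the four rotations, each with and without a left-right flip."""
    k = kernel
    for _ in range(4):
        yield k
        yield np.fliplr(k)
        k = np.rot90(k)

def symmetrize(kernel):
    """Tie weights by averaging a kernel over all of its D4 transforms,
    so the resulting filter responds identically to any rotated or
    reflected board position."""
    return sum(d4_transforms(kernel)) / 8.0

# Example: a random 3x3 filter becomes exactly symmetric after tying.
rng = np.random.default_rng(0)
w_tied = symmetrize(rng.normal(size=(3, 3)))
assert np.allclose(w_tied, np.rot90(w_tied))   # rotation invariant
assert np.allclose(w_tied, np.fliplr(w_tied))  # reflection invariant
```

The practical payoff of tying like this is that the network no longer has to learn each symmetric variant of a pattern separately, which effectively multiplies the training data for each filter.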
The last line of the abstract caught my eye:
This success at playing Go indicates high level principles of the game were learned.
That statement is expanded in 4.3 Playing Go:
The results are very promising. Even though the networks are playing using a ‘zero step look ahead’ policy, and using a fraction of the computation time of their opponents, they are still able to play better than GNU Go and take some games away from Fuego. Under these settings GNU Go might play at around a 6-8 kyu ranking and Fuego at 2-3 kyu, which implies the networks are achieving a ranking of approximately 4-5 kyu. For a human player, reaching this ranking would normally require years of study. This indicates that sophisticated knowledge of the game was acquired. This also indicates great potential for a Go program that integrates the information produced by such a network.
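The ‘zero step look ahead’ phrase is worth unpacking: the network is used greedily, scoring every point on the board in a single forward pass and playing the highest-probability legal move, with no search at all. A rough sketch of that loop might look like the following, where `predict_move_probs` and `legal_moves` are hypothetical stand-ins for the trained network and a rules engine:

```python
import numpy as np

def greedy_move(board, predict_move_probs, legal_moves):
    """'Zero step look ahead' play: run one forward pass of the trained
    network, then play the legal point with the highest predicted
    probability. No search tree, no rollouts -- pure pattern recognition."""
    probs = predict_move_probs(board)     # hypothetical: (19, 19) array of move probabilities
    scores = np.full((19, 19), -np.inf)   # illegal points stay at -inf
    for row, col in legal_moves(board):   # hypothetical rules-engine helper
        scores[row, col] = probs[row, col]
    return np.unravel_index(np.argmax(scores), scores.shape)
```

That a policy this simple reaches roughly 4-5 kyu is exactly why the authors see potential in combining it with search: the network supplies the ‘intuition’, and something like Monte Carlo Tree Search could supply the reading.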
An interesting limitation is that the network can't communicate what it has learned; it can only produce an answer for a given situation. In gaming situations that opaqueness isn't immediately objectionable.
But what if the situation was fire/don’t fire in a combat situation? Would the limitation that the network can only say yes or no, with no way to explain its answer, be acceptable?
Is that any worse than humans inventing explanations for decisions that weren’t the result of any rational thinking process?
Some additional Go resources you may find useful: American Go Association, Go Game Guru (with a printable Go board and stones), GoBase.org (has a Japanese dictionary). Those sites will lead you to many other Go sites.