K-means Pattern Learning for Move Evaluation in the Game of Go


Authors: Yunzhao Liang (21)
Shuoying Chen (21)
Keywords: Go; pattern learning; K-means
Publication: Lecture Notes in Computer Science
Year: 2014
Volume: 8862
Issue: 1
Pages: 484-495
Full-text size: 1,168 KB
Author affiliations: Yunzhao Liang (21)
Shuoying Chen (21)

21. Beijing Institute of Technology, China
ISSN: 1611-3349

Abstract

The game of Go is one of the biggest challenges in the field of computer games. The large board makes Go very complex and its positions hard to evaluate. In this paper, we propose a method that reduces the complexity of Go by learning and extracting patterns from game records. This method is more efficient and stronger than the baseline method we have chosen. Our method has two major components: (a) a pattern learning method based on K-means, which learns and extracts patterns from game records, and (b) a perceptron, which learns the win rates of Go positions. We build an agent to evaluate the performance of our method and obtain at least a 20% performance improvement or a 25% saving in computing power in most circumstances.
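
The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the pattern size, the feature encoding, or the perceptron's form, so the 3x3 patches, the cluster-histogram features, the binary win/loss perceptron, and every function name below are assumptions made only for illustration.

```python
import random

def extract_patches(board, size=3):
    """Return every size x size window of a square board as a flat tuple.
    Cells are +1 (black), -1 (white), 0 (empty). Patch size is an assumption."""
    n = len(board)
    return [tuple(board[r + i][c + j] for i in range(size) for j in range(size))
            for r in range(n - size + 1) for c in range(n - size + 1)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, p):
    """Index of the centroid closest to patch p."""
    return min(range(len(centroids)), key=lambda k: sq_dist(centroids[k], p))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; initial centroids are sampled from the data."""
    rng = random.Random(seed)
    centroids = [tuple(float(v) for v in p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(col) / len(g) for col in zip(*g))
    return centroids

def features(board, centroids):
    """Hypothetical encoding: fraction of patches assigned to each centroid."""
    patches = extract_patches(board)
    counts = [0.0] * len(centroids)
    for p in patches:
        counts[nearest(centroids, p)] += 1.0
    return [c / len(patches) for c in counts]

def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Classic perceptron rule; labels are +1 (win) or -1 (loss).
    A binary classifier is a simplification of learning win rates."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # update only on misclassified samples
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 if the linear score is positive, otherwise -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

In this sketch, patches gathered from many recorded positions would be passed to `kmeans`, each position would then be turned into a feature vector with `features`, and `train_perceptron` would be fitted on those vectors paired with the game outcomes.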
