用户名: 密码: 验证码:
四国军棋智能系统定式库及开局匹配研究与实现
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
人机博弈是人工智能的一个重要研究领域,其中不完全信息的人机博弈能够模拟现实复杂世界中不确定环境下的决策,因此越来越受到关注。四国军棋是一种典型的不完全信息游戏,其特点是不仅需要在对手和同盟棋子信息不确定的情况下做出决策,而且需要考虑与同盟的合作问题。目前四国军棋人机博弈研究存在的两个主要问题是:一、尽管针对四国军棋本身特点进行了搜索算法的研究,但是搜索的深度和结果,都还是难以令人满意;二、由于基础性研究还不够深入,目前没有好的评价函数。这两大瓶颈严重地影响了四国军棋人机博弈系统智能水平的高低。因此,有必要从其它方面入手对四国军棋开展研究。
     本文围绕四国军棋的人机博弈展开深入的研究与分析,主要工作如下:
     1)参考围棋的定式库技术、中国象棋和国际象棋的开局库技术和残局库技术,将定式库技术引入四国军棋的人机博弈研究。设计与实现了四国军棋的定式库以及相应的定式库开发系统,并在人机博弈系统中使用定式库技术来进行最优策略的决策。定式库技术在四国军棋博弈系统中的应用,降低了博弈系统对搜索算法的依赖,避免系统单纯依靠搜索算法而犯战略上的低级错误。
     2)针对棋手所用布局的倾向性和范围性,本文提出了一种基于样本的策略指导方法——开局匹配算法。该算法主要应用于开局阶段,根据开局阶段获得的少量信息,对待选的样本库进行快速地筛选,从而得到当前布局的假想布局,指导最优策略的决策。
     3)针对四国军棋的不完全信息特征,提出了四国军棋的蒙特卡罗算法。该算法通过单样本条件下的最优策略在整体样本条件下的模拟游戏,选出表现最好的策略作为最优策略。四国军棋的蒙特卡罗算法通过模拟游戏将不确定因素从评价函数中剥离出来,为评价函数的设计提供了新的思路。
     4)由于原有实验平台Nhope V1&V2版本所采用的系统框架主要侧重于博弈搜索,而且其智能模块的过程化的编程方式也使其可扩展性受到限制。定式库技术、开局匹配技术与原有的系统框架存在冲突,同时,为了增强实验平台的可扩展性,本文设计与实现了Nhope V3。Nhope V3实验平台在设计的过程中采用了面向对象的设计方法,同时注重结构设计的天然性和合理性,使得新的实验平台易于理解和扩展。
Game-playing is an important domain for Artificial Intelligence Research. Games with imperfect information that can simulate decision making under the uncertainty of the real world are becoming more and more popular. Siguo is a typical imperfect information game, where completing players make decision under conditions of uncertainty and have to cooperate with leaguing players. For the moment, there are two major issues in Siguo Computer Game-playing Research as follows: First, the Search Algorithm of Game-tree is still imperfect and could not search deep enough, though a lot of efforts have been maken in this area. Second, one of the basic elements in Computer Game-playing Research: construction method of evaluation function is still not so good. Those two major issues have seriously affected the development of Siguo Computer Game-playing Research. Therefore, it is necessary to start from other aspects of research on the Siguo.
     This paper has made detail research about Siguo Computer Game-playing and the main works are as follows:
     1) Reference to the Database of Joseki in Go, the Opening-book & the Endgame-library in Chess & Chinese Chess, this paper introduces the technology of Joseki-database into Siguo Computer Game-playing. This paper puts those particular partial conditions with relative strategies in the Database of Joseki and matches Joseki to find the best choice before Search Algorithm. With the Database of Joseki, the Game-playing system is smarter than before.
     2) Specific to the tendency and range-ability of the layouts used by one player, this paper introduces a new way based on samples to make the best choice: Quick Matching Algorithm. Quick Matching Algorithm is mainly applied in the stage of opening. Using the information gained by the stage of opening, we screen those samples quickly and finally get a sample that is the most qualified to guide the decision-making of the best choice.
     3) Aim at the imperfect-information feature of Siguo game, this paper uses Monte Carlo Algorithm to simulate game-playing and finds the best choice for the current situation. In this way, the evaluation function is no longer connected with the probabilities. Monte Carlo Algorithm provides a new way to design the evaluation function.
     4) Since the previous platforms Nhope V1 & V2 are designed for the Search Algorithm, they are not adapted to the technology of Joseki-database, QM Algorithm or Monte Carlo Algorithm, therefore we design and implement Nhope V3. Nhope V3 is designed with OOP and could be easily expanded.
引文
[1] H.Jaan van den Herik, Jos WHM Uiterwijk, Jack van Rijswijck. Games solved: Now and in the future. Artificial Intelligence, 2002, 134(1-2): 277~311.
    [2] C.E. Shannon. Programming a computer for playing chess. Philosophical Magazine, 1950, 41: 265~275.
    [3] A Bernstein, T Arbuckle, MDV Roberts, et al. A chess playing program for the IBM 704. Western joint computer conference: contrasts in computers, New York, USA: ACM, 1958: 157~159.
    [4] A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 1959, 3: 210~229.
    [5] A. Zorbrist. Feature Extractions and Representation for Pattern Recognition and the Game of Go, [PhD thesis]. Madison: University of Wisconsin, 1970.
    [6] D. Slate, L. Atkin. Chess 4.5-The Northwestern University chess program. P. Frey, Chess Skill in Man and Machine, Berlin: Springer, 1977: 82~118.
    [7] Donald E. Knuth, Ronald W. Moore. An analysis of alpha-beta pruning. Artificial Intelligence, 1975, 6(4):293~326.
    [8] Allen Newll, Cliff Shaw, Herbert Simon. Chess playing programs and the problem of complexity. IBM Journal of Research and Development, 1958, 4(2):320~335.
    [9] Arthur L.Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 1959, 3:211~229.
    [10] James H.Slagle, John K.Dixon. Experiments with some programs that search game trees. Journal of the ACM, 1969, 16(2):189~207.
    [11] A.L.Brudno. Bounds and valuations for shortening the search of estimates. Problems of Cybernetics, 1963, 10:225~241.
    [12] Gerard M.Baudet. The Design and Analysis of Algorithms for Asynchronous Multiprocessors, [PhD thesis]. Pittsburgh: Carnegie Mellon University, 1978.
    [13] Murray S.Campbell, T.Anthony Marsland. A comparison of minimax tree search algorithm. Artificial Intelligence, 1983, 20:347~367.
    [14] Richard E.Korf. Iterative deepening: An optimal admission tree search. Artificial Intelligence, 1985, 27:97~109.
    [15] Alexander Reinefeld, T.Anthony Marsland. Enhanced iterative-deepening search. IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1994, 16(7):701~710.
    [16] Subir Bhattacharya, Amitava Bagchi. A faster alternative to SSS* with extension to variable memory. Information processing letters, September 1993, 47:209~214.
    [17] J. Schaeffer, J. Culberson, N. Treloar, et al. A World Championship Caliber Checkers Program. Artificial Intelligence, 1992, 53 (2-3): 273~289.
    [18] M. Campbell, A. Joseph Hoane Jr., Feng-hsiung Hsu. Deep Blue. Artificial Intelligence, 2002, 134: 57~83.
    [19] T. McGrew. Collaborative Intelligence: The Internet Chess Club on Game 2 of Kasparov vs. Deep Blue. IEEE Internet Computing, 1997, 1(3): 38~42.
    [20] M. Buro. The Othello match of the year: Takeshi Murakami vs. Logistello. ICCA, 1997, 20 (3): 189~193.
    [21] S.J. Yen, J.C. Chen, T.N. Yang. Computer Chinese Chess. ICGA, 2004, 3:3~18.
    [22] M. Buro. Improving heurisitic mini-max search by supervised learning. Artificial Intelligence, 2002, 134(1-2): 85~99.
    [23] V.V. Anshelevich. A hierarchical approach to computer Hex. Artificial Intelligence, 2002, 134: 101~120.
    [24] H. Iida, M. Sakuta, J. Rollason. Computer Shogi. Artificial Intelligence, 2002, 134(1-2): 121~144.
    [25] M. Müller. Computer Go. Artificial Intelligence, 2002, 134: 145~179.
    [26] G. Tesauro. Programming backgammon using self-teaching neural nets. Artificial Intelligence, 2002, 134(1): 181~199.
    [27] M.L. Ginsberg. GIB: Imperfect information in a computationally challenging game. Journal of Artificial Intelligence Research, 2001, 14: 303~358.
    [28] D. Billings, A. Davidson, J. Schaeffer, et al. The challenge of poker. Artificial Intelligence, 2002, 134 (1-2): 201~240.
    [29]王晓春. PC游戏编程(人机博弈).重庆:重庆大学出版社, 2002: 101~169.
    [30] A. Plaat. Research re: Search & re-search, [PhD thesis]. Rotterdam, The Netherlands: Tinbergen Institute and Depart. of Computer Science, Erasmus University, 1996.
    [31] Don F. Beal. A Generalised Quiescence Search Algorithm. Artificial Intelligence, 1990, 43: 85~98.
    [32] G. C. Stockman. A minimax algorithm better than Alpha-Beta. Artificial Intelligence, 1979,12(2):179~196.
    [33] ZhengYou Xia, YunAn Hu, Jian Wang, et al. Analyze and Guess Type of Piece in the Computer Game Intelligent System. FSDK(2), Lecture Notes in Computer Science, 2005, 3614: 1174~1183.
    [34]夏正友,朱勇平,陆慧等.四国军旗游戏:一个新的人工智能研究测试平台.徐心和等, 2006中国机器博弈学术研讨会论文集,沈阳:东北大学人工智能与机器人研究所, 2006: 62~65.
    [35] ZhengYou Xia, YongPing Zhu, Hui Lu. Evaluation Function for Siguo Game based on Two Attitudes. FSDK, Lecture Notes in Computer Science, 2006, 4223: 1322~1331.
    [36] ZhengYou Xia, YongPing Zhu, Hui Lu. Using the Loopy Belief Propagation in Siguo. ICGA, 2007, 28(4):209~220.
    [37] Hui Lu, ZhengYou Xia. AWT: Aspiration with Timer Search Algorithm in Siguo. Conference on Computers and Games, Lecture Notes in Computer Science, 2008, 5131:264~274.
    [38]陆慧,夏正友.四国军棋游戏中搜索算法的实验与分析.江南大学学报(自然科学版), 2007, 6(06): 744~748.
    [39]廉师友.人工智能技术导论.西安:西安电子科技大学出版社, 2002: 119~122.
    [40]徐心和,王骄.中国象棋计算机博弈关键技术分析.小型微型计算机系统, 2006, 27(6): 961~969.
    [41]师军.围棋与人工智能.中国体育科技, 2005, 41(6): l35~l38.
    [42]谷蓉,刘学民,朱仲涛等.一种围棋定式的机器学习方法.计算机工程, 2004, 30(6): 142~144.
    [43] ZOBRIST A L. A New Hashing Method with Application for Game Playing. ICCA Journal, 1990 :134~140.
    [44]魏钦刚,王骄,徐心和等.中国象棋计算机博弈开局库研究与设计.徐心和等, 2006中国机器博弈学术研讨会论文集(CCGC’06),沈阳:东北大学人工智能与机器人研究所, 2006: 66~71.
    [45] K. Thompson. 6-piece endgames. ICCA, 1996, 19(4): 215~226.
    [46]袁为.从”人机大战”看人工智能的未来.中国青年科技, 2004, 6: 42~48.
    [47] S. Russell, P. Norvig.人工智能:一种现代方法(第二版)(姜哲等译).北京:人民邮电出版社, 2004: 34~40.
    [48] Jun S. Liu. Monte Carlo Strategies in Scientific Computing. New York: Springer, 2001: 1~21.
    [49] WR Gilks, DJ Spiegelhalter. Markov chain Monte Carlo in practice. London: Chapman&Hall,1996:1~30.
    [50] W. K. Hastings. Monte Carlo Sampling Methods Using Markov Chains and Their Applications. Biometrika, 1970:1~15.
    [51] RH Swendsen, JS Wang. Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett, 1987, 58(2): 86~88.
    [52] M. Ginsberg. GIB: Steps toward an expert-level bridge-playing program. T. Dean, IJCAI-99, San Francisco: Morgan Kaufmann, 1999: 584~589.
    [53] Stephen J. Smith, Dana Nau, Tom Throop. Computer Bridge: A Big Win for AI Planning. AI Magazine, 1998, 19(2): 93~105.
    [54] J Schaeffer, HJ Van den Herik. Games, computers and artificial intelligence. Artificial Intelligence, 2002, 134(1-2):1~7.
    [55] D. Billings, D. Papp, L. Pena, et al. Using selective-sampling simulations in poker. AAAI Spring Symposium on Search Techniques for Problem Solving under Uncertainty and Incomplete Information, 1999: 13~18.
    [56] D. Billings, D. Papp, J. Schaeffer, et al. Opponent modeling in poker. AAAI-98, WI: Madison, 1998: 493~499.
    [57] D. Billings, L. Pena, J. Schaeffer, et al. Using probabilistic knowledge and simulation to play poker. AAAI-99, FL: Orlando, 1999: 697~703.
    [58] CS Lee, MH Wang, G Chaslot, et al. The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments. Computational Intelligence and AI in Games, 2009, 1(1): 73~79.
    [59] GMJ Chaslot, MHM Winands, J Uiterwijk, et al. Progressive Strategies For Monte-Carlo Tree Search. New Mathematics and Natural Computation, 2008, 4(03): 343~357.
    [60] G.Chaslot, J.T. Saito, B. Bouzy, et al. Monte-Carlo Strategies for Computer Go. P.Y. Schobbens, W. Vanhoof and G. Schwanen, Proceedings of the 18th BeNeLux Conference on Artificial Intelligence, Belgium: Namur, 2006: 83~91.
    [61] B. Bruegmann. Monte carlo go. Unpublished, 1993.
    [62]朱勇平.四国军棋人机博弈系统的研究与实现, [硕士学位论文].南京:南京航空航天大学, 2008.
    [63] Jiawei Han, Micheline kamber. Data Mining: Concepts and Techniques.北京:机械工业出版社, 2006: 255~259.
    [64] Ian H.Witten, Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques.北京:机械工业出版社, 2006:93~96.
    [65]沈世镒,陈鲁生.信息论与编码理论.北京:科学出版社, 2002:20~30.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700