|Table of Contents|

Monte Carlo Tree Search for“Dou Di Zhu”Based on Splitting(PDF)

《南京师大学报(自然科学版)》[ISSN:1001-4616/CN:32-1239/N]

Issue:
2019年03期
Page:
107-114
Research Field:
·全国机器学习会议论文专栏·
Publishing date:

Info

Title:
Monte Carlo Tree Search for“Dou Di Zhu”Based on Splitting
Author(s):
Peng QiwenWang YisongYu XiaomingLiu ManyiXu Fangjing
School of Computer Science and Technology,Guizhou University,Guiyang 550025,China
Keywords:
Dou Di Zhucomputer gamereinforcement learningMonte Carlo tree search
PACS:
TP311
DOI:
10.3969/j.issn.1001-4616.2019.03.014
Abstract:
“Dou Di Zhu”is a typical multiplayer cooperative game with incomplete information. Monte Carlo tree search is an important tool to solve game problems(Go,chess,etc.). Firstly,this paper proposes a hand splitting algorithm based on the rules of “Dou Di Zhu”,which solves the problem of large action space by choosing smaller splitting. Secondly,this paper adopts Monte Carlo sampling method to simulate the incomplete cooperative game of “Dou Di Zhu”. After satisfying certain preset conditions,this paper chooses the node with the best income as the best decision. The experi-mental results show that the Monte Carlo tree search based on hand splitting can realize the automatic game of“Dou Di Zhu”in a smart way.

References:

[1] 张维迎. 博弈论与信息经济学[M]. 上海:上海人民出版社,2004.
[2]张加佳. 非完全信息机器博弈中风险及对手模型的研究[D]. 哈尔滨:哈尔滨工业大学,2015.
[3]VON N J,MORGENSTERN O. Theory of games and economic behavior[M]. Princton:Princeton University Press,1994.
[4]SHANNON C E. Programming a computer for playing chess[M]//Computer chess compendium. New York,USA:Springer,1988:2-13.
[5]ROIZEN I,PEARL J. A minimax algorithm better than alpha-beta?Yes and No[J]. Artificial intelligence,1983,21(1/2):199-220.
[6]FULLER S H,GASCHNIG J G,GILLOGLY J J. Analysis of the alpha-beta pruning algorithm[M]. USA:Carnegie-Mellon University,1973.
[7]GELLY S,SILVER D. Combining online and offline knowledge in UTC[C]//Proceedings of the 24th International Conference on Machine Learning. New York,USA:ACM,2007:273-280.
[8]CHASLOT G,BAKKES E,SZITA I. Monte-Carlo tree search:a new framework for game AI[J]. In Proceedings of AIIDEC-08,2008,4(2):216-217.
[9]SILVER D,HUANG A,MADDISON C J,et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature,2016,529(7587):484-489.
[10]刘洋. 点格棋博弈中UCT算法的研究与实现[D]. 安徽:安徽大学,2016.
[11]SILVER D,HUBERT T,SCHRITTWIESER J,et al. A general reinforcement learning algorithm that masters chess,shogi,and Go through self-play[J]. Science,2018,362(6419):1140-1144.
[12]SILVER D,SCHRITTWIESER J,SIMONYAN K,et al. Mastering the game of Go without human knowledge[J]. Nature,2017,550(7676):354-359.
[13]BROWN N,SANDHOLM T. Superhuman AI for heads-up no-limit poker:Libratus beats top professionals[J]. Science,2018,359(6374):418-424.
[14]DARSE B,AARON D,JONATHAN S,et al. The challenge of poker[J]. Artificial intelligence,2002,134:201-240.
[15]STURTEVANT N. Current challenges in multi-player game search[C]//International Conference on Computers and Games. Berlin,Heidelberg:Springer,2004:285-300.
[16]GELLY S,SILVER D. Monte-Carlo tree search and rapid action value estimation in computer Go[J]. Artificial intelligence,2011,175(11):1856-1875.
[17]SCHADD F C. Monte-Carlo search techniques in the modern board game thurn and taxis[D]. Netherlands:Maastricht University,2009.
[18]COULOM R. Effcient selectivity and backup operators in Monte-Carlo tree search[C]//5th Int Conf Comput and Gamesn Turin. Italy:Natural Comput,2006:72-83.
[19]GELLY S,WANG Y. Exploration exploitation in Go:UCT for Monte-Carlo go[C]//NIPS:Neural Information Processing Systems Conference On-line trading of Exploration and Exploitation Workshop. Canada:ffhal-00115330f,2006.

Memo

Memo:
-
Last Update: 2019-09-30