Frontiers of Computer Science

A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games

Abstract

Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play, without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large-scale search depth, while NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improves both training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.
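To make the abstract's description concrete, the sketch below shows the basic NFSP agent structure that MC-NFSP builds on: an off-policy best-response (action-value) network, an average-policy network trained on the agent's own best-response actions, and anticipatory mixing between the two. This is a minimal illustrative sketch only; the network sizes, the mixing parameter `eta`, and the random-transition demo loop are assumptions for illustration, not the paper's actual MC-NFSP implementation (which additionally guides the best response with Monte Carlo tree search).

```python
# Minimal NFSP-style agent sketch (illustrative assumptions, not the paper's code).
import random
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class NFSPAgent:
    def __init__(self, state_dim, n_actions, eta=0.1):
        self.n_actions = n_actions
        self.eta = eta                      # probability of playing the best response
        # Best-response network: trained off-policy (Q-learning) on RL memory.
        self.q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                   nn.Linear(64, n_actions))
        # Average-policy network: supervised on the agent's own best-response actions.
        self.pi_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                    nn.Linear(64, n_actions))
        self.q_opt = torch.optim.Adam(self.q_net.parameters(), lr=1e-3)
        self.pi_opt = torch.optim.Adam(self.pi_net.parameters(), lr=1e-3)
        self.rl_memory = []                 # (s, a, r, s', done) transitions
        self.sl_memory = []                 # (s, a) best-response pairs

    def act(self, state):
        s = torch.as_tensor(state, dtype=torch.float32)
        if random.random() < self.eta:
            # Play the (approximate) best response and record it for supervised learning.
            a = int(self.q_net(s).argmax())
            self.sl_memory.append((state, a))
        else:
            # Play the average policy.
            probs = F.softmax(self.pi_net(s), dim=-1)
            a = int(torch.multinomial(probs, 1))
        return a

    def train_step(self, batch_size=32, gamma=1.0):
        # Off-policy Q-learning update on sampled transitions.
        if len(self.rl_memory) >= batch_size:
            s, a, r, s2, done = map(np.array, zip(*random.sample(self.rl_memory, batch_size)))
            q = self.q_net(torch.as_tensor(s, dtype=torch.float32))
            q = q.gather(1, torch.as_tensor(a).long().unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                q2 = self.q_net(torch.as_tensor(s2, dtype=torch.float32)).max(1).values
                target = (torch.as_tensor(r, dtype=torch.float32)
                          + gamma * (1 - torch.as_tensor(done, dtype=torch.float32)) * q2)
            loss_q = F.mse_loss(q, target)
            self.q_opt.zero_grad(); loss_q.backward(); self.q_opt.step()
        # Supervised update of the average policy toward past best-response actions.
        if len(self.sl_memory) >= batch_size:
            s, a = map(np.array, zip(*random.sample(self.sl_memory, batch_size)))
            logits = self.pi_net(torch.as_tensor(s, dtype=torch.float32))
            loss_pi = F.cross_entropy(logits, torch.as_tensor(a).long())
            self.pi_opt.zero_grad(); loss_pi.backward(); self.pi_opt.step()

# Toy usage with random transitions, only to show the self-play training loop shape.
agent = NFSPAgent(state_dim=4, n_actions=3)
for _ in range(100):
    s, s2 = np.random.rand(4), np.random.rand(4)
    a = agent.act(s)
    agent.rl_memory.append((s, a, np.random.rand(), s2, False))
    agent.train_step()
```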
