Conference: International Conference on Learning and Intelligent Optimization

Consistency Modifications for Automatically Tuned Monte-Carlo Tree Search

Abstract

Monte-Carlo Tree Search algorithms (MCTS [4,6]), including upper confidence trees (UCT [9]), are known for their impressive performance on high-dimensional control problems. While the main test-bed is the game of Go, the number of applications is growing [13,12,7], and these algorithms are now widely accepted as strong candidates for high-dimensional control applications. Unfortunately, MCTS is known to require some tuning for optimal performance on a given problem. This tuning is often handcrafted or automated, and in some cases it causes a loss of consistency, i.e. poor asymptotic behavior as the computational power grows. This highly undesirable property caused our main MCTS program, MoGo, to behave badly in a real-world situation described in Section 3. It is a serious problem for our work on automatic parameter tuning [3] and on the genetic programming of new features in MoGo. This paper presents: a theoretical analysis of MCTS consistency; detailed examples of consistent and inconsistent known algorithms; and a way to modify an MCTS implementation to ensure consistency, independently of modifications to the "scoring" module (the module that is automatically tuned and genetically programmed in MoGo). As a by-product of this work, we show the interesting property that some heavily tuned MCTS implementations are better than UCT in the sense that they do not visit the complete tree (whereas UCT asymptotically does), while preserving consistency, at least if the "consistency" modifications above have been made.
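The notion of consistency mentioned in the abstract can be illustrated with a minimal sketch on a toy two-armed bandit (a one-level tree). This is not the paper's modification or MoGo's scoring module: the reward model, the function names (`reward`, `run`, `greedy_score`, `ucb1_score`) and the iteration count are assumptions made purely for illustration. A purely greedy score with no exploration term is misled by one bad early sample and never recovers, however much computation it is given, whereas the standard UCB1 score used in UCT keeps revisiting both arms and ends up concentrating on the better one.

```python
import math

def reward(arm, pulls_so_far):
    """Hypothetical deterministic rewards: arm 1 is better in the long run,
    but its very first sample is misleadingly bad."""
    if arm == 0:
        return 0.4
    return 0.0 if pulls_so_far == 0 else 1.0

def greedy_score(mean, n, t):
    # Empirical mean only, no exploration term.
    return mean

def ucb1_score(mean, n, t):
    # Standard UCB1: empirical mean plus an exploration bonus.
    return mean + math.sqrt(2.0 * math.log(t) / n)

def run(score, iterations=1000):
    """Pull arms for `iterations` steps using `score`; return visit counts."""
    counts = [0, 0]
    sums = [0.0, 0.0]
    for t in range(iterations):
        if t < 2:
            arm = t  # visit each arm once before scoring
        else:
            arm = max(range(2),
                      key=lambda a: score(sums[a] / counts[a], counts[a], t))
        sums[arm] += reward(arm, counts[arm])
        counts[arm] += 1
    return counts

greedy_counts = run(greedy_score)
ucb1_counts = run(ucb1_score)
print("greedy visits:", greedy_counts)
print("UCB1 visits:  ", ucb1_counts)
```

In this toy setting the greedy variant stays on arm 0 for any number of iterations, which is the kind of asymptotic failure the abstract calls a loss of consistency; UCB1 avoids it at the price of visiting every arm infinitely often.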