
Mastering the game of Go with deep neural networks and tree search

Abstract

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
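The abstract says only that the search algorithm "combines Monte Carlo simulation with value and policy networks" and gives no formulas. The following is a minimal Python sketch of one way such a combination can work, not the authors' implementation. It assumes a policy network supplies a prior probability for each move, and that a leaf position is scored by averaging a value-network estimate with the outcome of one simulated game. The names `Edge`, `select_move`, `leaf_value`, and `backup`, the constant `c_puct`, the mixing weight `mix`, and the `+ 1` under the square root are all illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Search statistics for one candidate move from a position (hypothetical layout)."""
    prior: float               # move probability from an assumed policy network
    visits: int = 0            # number of simulations that passed through this move
    total_value: float = 0.0   # sum of leaf evaluations backed up through this move

    @property
    def q(self) -> float:
        # Mean evaluation so far; 0.0 before the first visit.
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_puct: float = 5.0) -> str:
    """Pick the move maximising mean value plus a prior-weighted exploration bonus."""
    total = sum(e.visits for e in edges.values())

    def score(e: Edge) -> float:
        # The bonus shrinks as a move is visited more often. The "+ 1" is a
        # sketch-only tweak so priors break ties before any visit is made.
        return e.q + c_puct * e.prior * math.sqrt(total + 1) / (1 + e.visits)

    return max(edges, key=lambda move: score(edges[move]))


def leaf_value(value_net_estimate: float, rollout_outcome: float, mix: float = 0.5) -> float:
    """Blend an assumed value-network estimate with a Monte Carlo rollout result."""
    return (1 - mix) * value_net_estimate + mix * rollout_outcome


def backup(edge: Edge, value: float) -> None:
    """Record one simulation result on the move that led to the evaluated leaf."""
    edge.visits += 1
    edge.total_value += value
```

In this sketch, a move with a high prior is tried first, and a poor evaluation backed up through it shifts later simulations toward other moves. A full search would repeat selection, evaluation, and backup over a whole tree of positions.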
Bibliographic record

  • Source
    Nature | 2016, issue 7587 | pp. 484-489 | 6 pages
  • Author affiliations

    Google DeepMind, 5 New St Sq, London EC4A 3TW, England (listed for 18 of the 20 author entries);

    Google, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 USA (listed for 2 of the 20 author entries);

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI); MEDLINE; Chemical Abstracts (CA)
  • Original format: PDF
  • Language: English