Crawling in Rogue's Dungeons With Deep Reinforcement Techniques

Asperti Andrea; Cortesi Daniele; De Pieri Carlo; Pedrini Gianmaria; Sovrano Francesco


Abstract

This paper reports on our extensive experimentation, over the last two years, with deep reinforcement learning techniques for training an agent to move through the dungeons of the well-known video game Rogue. The difficulty of the problem is closely tied to the procedural, random generation of a new dungeon map at each level, which rules out any form of level-specific learning and forces us to address the navigation problem in its full generality. Two other aspects of the game are interesting from the point of view of automatic learning: the problem is partially observable, since maps are initially hidden and are discovered only during exploration, and rewards are sparse, so the agent must acquire complex, nonreactive behaviors involving memory and planning. In this paper, we build on previous work to make a more systematic comparison of different learning techniques, focusing in particular on Asynchronous Advantage Actor-Critic (A3C) and Actor-Critic with Experience Replay (ACER). In a game like Rogue, the sparsity of rewards is mitigated by the variability of the dungeon configurations (sometimes, by luck, the exit is close at hand); if this variability can be tamed, as ACER seems able to do better than the other algorithms, the problem of sparse rewards can be overcome without any need for intrinsic motivation.
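To make the sparse-reward issue concrete, here is a minimal sketch of the discounted return and advantage that actor-critic methods such as A3C estimate. It is an illustration only, not the authors' implementation: the function names, the discount factor, and the toy episode (reward only on the final step, standing in for reaching the exit) are all assumptions.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99,
                       bootstrap: float = 0.0) -> List[float]:
    """Compute R_t = r_t + gamma * R_{t+1}, working backwards from the last step.

    `bootstrap` is the value estimate of the state after the last step
    (0.0 when the episode has terminated).
    """
    returns = [0.0] * len(rewards)
    running = bootstrap
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns: List[float], values: List[float]) -> List[float]:
    """Advantage A_t = R_t - V(s_t): how much better the outcome was than predicted."""
    if len(returns) != len(values):
        raise ValueError("returns and values must have the same length")
    return [r - v for r, v in zip(returns, values)]


# Hypothetical episode: no reward until the final step.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
episode_returns = discounted_returns(episode_rewards, gamma=0.5)
print(episode_returns)  # [0.125, 0.25, 0.5, 1.0]

# With an untrained critic that predicts 0 everywhere, the advantages equal the returns.
print(advantages(episode_returns, [0.0, 0.0, 0.0, 0.0]))
```

In an episode that never reaches the rewarded step, every return is zero, so the agent gets no learning signal from it. Under this reading, randomly generated maps with a nearby exit are what give the agent its first nonzero returns.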


Bibliographic details

Source
IEEE Transactions on Games, 2020, no. 2, pp. 177-186 (10 pages)
Authors
Asperti Andrea; Cortesi Daniele; De Pieri Carlo; Pedrini Gianmaria; Sovrano Francesco
Author affiliations

Univ Bologna, Dept Informat Sci & Engn, I-40127 Bologna, Italy (same affiliation listed for all five authors)
Original format: PDF
Language: English
Keywords
Actor-Critic with Experience Replay (ACER); Asynchronous Advantage Actor-Critic (A3C); attention; deep reinforcement learning; dungeon; experience replay; labyrinth; maze; partially observable Markov decision process (MDP); Rogue; sparsity of rewards;


Similar literature

1. Crawling the Deep Web Using Asynchronous Advantage Actor Critic Technique [J]. Madan Kapil, Bhatia Rajesh. Journal of Web Engineering, 2021, no. 3.
2. I-SEE: Intelligent, Secure, and Energy-Efficient Techniques for Medical Data Transmission Using Deep Reinforcement Learning [J]. Allahham Mhd Saria, Abdellatif Alaa Awad, Mohamed Amr. IEEE Internet of Things Journal, 2021, no. 8.
3. Accelerated deep reinforcement learning with efficient demonstration utilization techniques [J]. Yeo Sangho, Oh Sangyoon, Lee Minsu. World Wide Web, 2021, no. 4.
4. Crawling in Rogue's Dungeons with (Partitioned) A3C [C]. Andrea Asperti, Daniele Cortesi, Francesco Sovrano. International Conference on Machine Learning, Optimization, and Data Science, 2019.
5. Deep Reinforcement Learning with Accelerated Reward Function Technique for Robotics Task Planning [D]. Shaikh, Shifa. 2021.
6. Changes in Selected Parameters of Swimming Technique in the Back Crawl and the Front Crawl in Young Novice Swimmers [O]. Damian Jerszyński, Katarzyna Antosiak-Cyrak, Małgorzata Habiera. 2013.
7. Improving deep reinforcement learning with advanced exploration and transfer learning techniques [O]. Haiyan Yin.
8. Focused Crawling of the Deep Web Using Service Class Descriptions [R]. Rocco, D., Liu, L., Critchlow, T. 2005.
