Journal: Neurocomputing

Hierarchical automatic curriculum learning: Converting a sparse reward navigation task into dense reward

Abstract

Mastering sparse-reward or long-horizon tasks is critical but challenging in reinforcement learning. To tackle this problem, we propose a hierarchical automatic curriculum learning framework (HACL), which intrinsically motivates the agent to explore environments hierarchically and progressively. The agent is equipped with a target area during training. As the target area progressively grows, the agent learns to explore from near to far, in a curriculum fashion. The pseudo target-achieving reward converts the sparse reward into a dense reward, thereby alleviating the long-horizon difficulty. The whole system makes hierarchical decisions: a high-level conductor travels through different targets, and a low-level executor operates in the original action space to complete the instructions given by the high-level conductor. Unlike many existing works that manually set curriculum training phases, HACL automates the entire curriculum training process and adapts it to the agent's current exploration capability. Extensive experiments on three sparse-reward tasks (a long-horizon stochastic chain, a grid maze, and the challenging Atari game Montezuma's Revenge) show that HACL achieves comparable or even better performance with significantly fewer training frames. © 2019 Elsevier B.V. All rights reserved.
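
The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: a target area that grows with the agent's ability, and a pseudo target-achieving reward added to the sparse environment reward. Everything here is an assumption made for illustration, not the authors' method: the 1-D chain layout, the names TargetCurriculum and shaped_reward, the success-rate threshold used to grow the area, and the bonus value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TargetCurriculum:
    """Hypothetical curriculum over a 1-D chain with states 0..n_states-1."""
    n_states: int
    radius: int = 1                 # current size of the target area (assumed)
    success_threshold: float = 0.8  # assumed criterion for growing the area

    def targets(self, start: int = 0) -> List[int]:
        """States a high-level conductor may currently choose as targets."""
        far = min(start + self.radius, self.n_states - 1)
        return list(range(start + 1, far + 1))

    def update(self, success_rate: float) -> None:
        """Grow the target area once targets are reached reliably."""
        can_grow = self.radius < self.n_states - 1
        if success_rate >= self.success_threshold and can_grow:
            self.radius += 1


def shaped_reward(state: int, target: int, env_reward: float,
                  pseudo_bonus: float = 1.0) -> float:
    """Sparse environment reward plus a pseudo reward for reaching the target."""
    return env_reward + (pseudo_bonus if state == target else 0.0)
```

In this sketch a low-level executor would be trained on shaped_reward, so it receives feedback whenever it reaches the nearby target even while the environment reward is still zero; the curriculum only exposes farther targets after update sees a high enough success rate.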
