Foreign Conference Proceedings > BICA Society Meeting > An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration

An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration



Abstract

In real-world settings, rewards are often sparse because the state space is huge. Reinforcement learning agents must acquire exploration skills to obtain rewards in such environments. In that case, curiosity, defined as an internally generated reward for state prediction error, can encourage agents to explore their environments. However, when a robot learns its policy by reinforcement learning, changes in the policy's outputs cause jerking because of inertia. Jerking prevents the state prediction from converging, which makes policy learning unstable. In this paper, we propose Arbitrable Intrinsically Motivated Exploration (AIME), which enables robots to stably learn curiosity-based exploration. AIME uses the Accumulator Based Arbitration Model (ABAM), which we previously proposed as an ensemble learning method inspired by the prefrontal cortex. ABAM adjusts motor controls to improve the stability of reward generation and reinforcement learning. In experiments, we show that a robot can explore a non-reward simulated environment with AIME.
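The curiosity signal described above (intrinsic reward equal to the forward model's state-prediction error) can be sketched as follows. This is an illustrative minimal sketch, not the authors' architecture: the linear forward model, the learning rate, and the toy dynamics are all assumptions.

```python
import numpy as np

class ForwardModel:
    """Predicts the next state from (state, action); the prediction
    error serves as the intrinsic (curiosity) reward."""
    def __init__(self, state_dim, action_dim, lr=0.05):
        rng = np.random.default_rng(0)
        self.W = rng.normal(scale=0.1, size=(state_dim, state_dim + action_dim))
        self.lr = lr

    def intrinsic_reward(self, state, action, next_state):
        x = np.concatenate([state, action])
        pred = self.W @ x
        err = next_state - pred
        # Online gradient step on the squared prediction error (LMS rule).
        self.W += self.lr * np.outer(err, x)
        return 0.5 * float(err @ err)  # curiosity = prediction error

def rollout(model, steps=200):
    """Toy deterministic linear dynamics (assumed for illustration):
    curiosity should shrink as the forward model learns them."""
    rng = np.random.default_rng(1)
    A = np.array([[0.9, 0.1], [-0.1, 0.9]])  # state transition (assumed)
    B = np.array([[0.5], [0.2]])             # action effect (assumed)
    s = np.zeros(2)
    rewards = []
    for _ in range(steps):
        a = rng.uniform(-1.0, 1.0, size=1)
        s_next = A @ s + B @ a
        rewards.append(model.intrinsic_reward(s, a, s_next))
        s = s_next
    return rewards

model = ForwardModel(state_dim=2, action_dim=1)
r = rollout(model)
```

On deterministic dynamics the intrinsic reward decays as the model improves; in states the model already predicts well, curiosity vanishes, which is what pushes the agent toward unexplored states. Note the abstract's point about jerking: if the policy's outputs change abruptly, `next_state` is corrupted by inertia effects the forward model cannot fit, so this error signal never converges.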
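The abstract does not specify ABAM's internals, so the following is a generic accumulator-race sketch of output arbitration, not the authors' ABAM: each candidate module accumulates evidence for its proposed motor command, and control switches only when a challenger's accumulated evidence clearly beats the incumbent's, suppressing the rapid output changes that cause jerking.

```python
import numpy as np

class AccumulatorArbiter:
    """Hypothetical accumulator-based arbiter (illustrative, not ABAM):
    leaky evidence accumulation with a switching threshold."""
    def __init__(self, n_modules, threshold=3.0, leak=0.9):
        self.acc = np.zeros(n_modules)
        self.threshold = threshold
        self.leak = leak
        self.active = 0  # module currently controlling the output

    def step(self, scores):
        """scores[i]: instantaneous preference for module i's command."""
        self.acc = self.leak * self.acc + scores  # leaky accumulation
        challenger = int(np.argmax(self.acc))
        # Switch only when the challenger clearly beats the incumbent,
        # so momentary score fluctuations cannot jerk the output.
        if challenger != self.active and (
            self.acc[challenger] - self.acc[self.active] > self.threshold
        ):
            self.active = challenger
            self.acc[:] = 0.0  # reset evidence after a switch
        return self.active

arb = AccumulatorArbiter(n_modules=2)
rng = np.random.default_rng(0)
# Noisy scores consistently favoring module 1: the arbiter holds
# module 0 until sustained evidence crosses the threshold, then
# switches exactly once and stays with module 1.
choices = [arb.step(np.array([0.0, 0.4]) + rng.normal(scale=0.1, size=2))
           for _ in range(50)]
```

The design choice this illustrates is hysteresis: a single switch replaces the step-by-step output flapping a plain argmax over instantaneous scores would produce, which is the stability property the abstract attributes to ABAM.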
