Journal: IEEE Transactions on Autonomous Mental Development

Model-Free Reinforcement Learning of Impedance Control in Stochastic Environments

Abstract

For humans and robots, variable impedance control is an essential component for ensuring robust and safe physical interaction with the environment. Humans learn to adapt their impedance to specific tasks and environments, a capability that we continue to develop and improve until well into our twenties. In this article, we reproduce functionally interesting aspects of learning impedance control in humans on a simulated robot platform. As demonstrated in numerous force field tasks, humans combine two strategies to adapt their impedance to perturbations, thereby minimizing position error and energy consumption: 1) if perturbations are unpredictable, subjects increase their impedance through cocontraction; and 2) if perturbations are predictable, subjects learn a feed-forward command to offset the perturbation. We show how a 7-DOF simulated robot demonstrates similar behavior with our model-free reinforcement learning algorithm $\mathrm{PI}^2$, by applying deterministic and stochastic force fields to the robot's end-effector. We show the qualitative similarity between the robot and human movements. Our results provide a biologically plausible approach to learning appropriate impedances purely from experience, without requiring a model of either body or environment dynamics. Not requiring models also facilitates autonomous development for robots, as prespecified models cannot be provided for each environment a robot might encounter.
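
The two strategies named in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's $\mathrm{PI}^2$ algorithm or its 7-DOF robot; it assumes a 1-DOF point mass held at the origin by a critically damped impedance controller, with hand-picked gains rather than learned ones. The function names, the force magnitude of 5 N, and the stiffness values are all hypothetical choices made for illustration, and the energy cost of high stiffness is not modeled.

```python
def mean_abs_error(stiffness, feedforward, force, steps=2000, dt=0.001, mass=1.0):
    """Mean |position error| of a 1-DOF point mass held at x = 0.

    Assumed control law (illustrative, not taken from the paper):
        u = -stiffness * x - damping * v + feedforward
    `force` is a constant external perturbation applied for the whole trial.
    """
    damping = 2.0 * (stiffness * mass) ** 0.5  # critical damping for this stiffness
    x, v, total = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = -stiffness * x - damping * v + feedforward
        v += (u + force) / mass * dt  # semi-implicit Euler integration
        x += v * dt
        total += abs(x)
    return total / steps


def expected_error(stiffness, feedforward, forces):
    """Average the error over a set of equally likely perturbation forces."""
    return sum(mean_abs_error(stiffness, feedforward, f) for f in forces) / len(forces)


if __name__ == "__main__":
    predictable = [5.0]           # the same force on every trial
    unpredictable = [5.0, -5.0]   # the sign of the force is unknown in advance

    strategies = {
        "low stiffness, no feed-forward": (25.0, 0.0),
        "low stiffness, feed-forward -5": (25.0, -5.0),
        "high stiffness, no feed-forward": (400.0, 0.0),
    }
    for field_name, forces in (("predictable", predictable),
                               ("unpredictable", unpredictable)):
        for label, (k, ff) in strategies.items():
            err = expected_error(k, ff, forces)
            print(f"{field_name:13s} | {label:31s} | mean |x| = {err:.4f}")
```

In this toy setting a feed-forward command removes the error when the force is the same on every trial, while raising the stiffness is the more effective choice when the sign of the force cannot be predicted, which mirrors the qualitative behavior the abstract describes.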