Foreign-language conference: BICA Society, Meeting

On Stable Profit Sharing Reinforcement Learning with Expected Failure Probability


Abstract

This paper defines the Expected Success Probability (ESP) and proposes a reinforcement learning method, Stable Profit Sharing with Expected Failure Probability (SPSwithEFP). In SPSwithEFP, the Expected Failure Probability (EFP) is used in roulette wheel action selection, and ESP is used in the update equation for the weight of a rule. EFP lets the learner discard risky actions, and ESP narrows the distribution of the learned results. The effectiveness of the method is shown in simulation experiments on a maze environment with pitfalls.
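The abstract does not give the paper's equations, so the following Python sketch only illustrates the two roles it describes: a failure-probability estimate used during roulette wheel selection, and a success-probability estimate used when rule weights are updated. The threshold rule, the multiplicative use of ESP, the geometric credit decay, and all names (`roulette_select`, `reinforce_episode`, `efp_threshold`, `decay`) are assumptions made for illustration, not the authors' definitions.

```python
import random


def roulette_select(weights, efp, efp_threshold=0.5, rng=random):
    """Roulette wheel selection over the actions of one state.

    Assumption: actions whose estimated failure probability is at or
    above efp_threshold are dropped before the wheel is spun. The paper
    may combine EFP with the weights differently.
    """
    candidates = [i for i, p in enumerate(efp) if p < efp_threshold]
    if not candidates:
        # Every action looks risky: fall back to the least risky one.
        return min(range(len(efp)), key=lambda i: efp[i])
    total = sum(weights[i] for i in candidates)
    if total <= 0:
        return rng.choice(candidates)
    r = rng.random() * total
    acc = 0.0
    for i in candidates:
        acc += weights[i]
        if r < acc:
            return i
    return candidates[-1]


def reinforce_episode(weights, episode, reward, esp, decay=0.5):
    """Profit-sharing-style credit assignment at the end of an episode.

    Walks the (state, action) rules backwards from the reward, giving
    each rule a geometrically decreasing share. Assumption: the share is
    scaled by the rule's estimated success probability.
    """
    credit = reward
    for state, action in reversed(episode):
        weights[state][action] += credit * esp[state][action]
        credit *= decay
    return weights


# Hypothetical two-state example.
weights = {0: [1.0, 1.0], 1: [1.0, 1.0]}
esp = {0: [1.0, 0.5], 1: [0.5, 1.0]}
reinforce_episode(weights, [(0, 1), (1, 0)], reward=8.0, esp=esp)
choice = roulette_select([1.0, 1.0, 1.0], efp=[0.9, 0.1, 0.8])
```

In this sketch, filtering by EFP happens before the wheel is spun, so a risky action can never be drawn while a safer one is available. Scaling the credit by ESP means rules that rarely lead to success accumulate weight more slowly.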
