On Stable Profit Sharing Reinforcement Learning with Expected Failure Probability

Abstract

This paper defines Expected Success Probability (ESP) and proposes a reinforcement learning method, Stable Profit Sharing with Expected Failure Probability (SPSwithEFP). In SPSwithEFP, Expected Failure Probability (EFP) is used in the roulette wheel selection method, and ESP is used in the update equation for the weight of a rule. EFP makes it possible to discard risky actions, and ESP narrows the distribution of learned results. The effectiveness of the method is shown through simulation experiments in a maze environment with pitfalls.
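The abstract does not give the selection or update equations, so the following is only a generic sketch of roulette wheel selection with a risk filter, not the authors' SPSwithEFP. The function name `roulette_select`, the `efp_threshold` parameter, and the rule of excluding actions whose estimated failure probability exceeds that threshold are all assumptions made for illustration.

```python
import random


def roulette_select(weights, efp, efp_threshold=0.5, rng=random):
    """Pick an action index with probability proportional to its weight.

    Hypothetical filter: actions whose estimated failure probability
    exceeds efp_threshold are excluded before the wheel is spun.
    The actual rule used in the paper is not stated in the abstract.
    """
    # Keep only actions that are not considered too risky.
    candidates = [i for i, p in enumerate(efp) if p <= efp_threshold]
    if not candidates:
        # Every action looks risky: fall back to the least risky one.
        return min(range(len(efp)), key=lambda i: efp[i])

    total = sum(weights[i] for i in candidates)
    if total <= 0:
        # No usable weights yet: choose uniformly among safe actions.
        return rng.choice(candidates)

    # Standard roulette wheel: walk the cumulative weights until the
    # random point r is covered.
    r = rng.uniform(0, total)
    acc = 0.0
    for i in candidates:
        acc += weights[i]
        if r <= acc:
            return i
    return candidates[-1]  # guard against floating-point round-off
```

For example, `roulette_select([1.0, 5.0, 2.0], [0.1, 0.9, 0.2])` never returns index 1 under these assumptions, because that action's assumed failure probability is above the threshold even though its weight is the largest.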

Bibliographic details

Source
BICA Society., Meeting | 2019 | xv, 362 pages | 6 pages in total
Authors
Daisuke Mizuno; Kazuteru Miyazaki; Hiroaki Kobayashi
Original format: PDF
Chinese Library Classification: TP18-532
Keywords
Reinforcement learning; XoL; Profit Sharing; EFP
