Conference: International Conference on Soft Computing

MarcoPolo: A Reinforcement Learning System considering tradeoff exploration and exploitation under Markovian Environments



Abstract

Reinforcement learning is a kind of machine learning that aims to adapt an agent to a given environment, using rewards as a clue. We consider that an ideal reinforcement learning system should obtain some rewards even at an early learning phase and obtain more rewards as exploration of the environment proceeds. In this paper, we propose a unified learning system, MarcoPolo, that takes account of both getting rewards by Profit Sharing or Policy Iteration and exploring the environment by the k-Certainty Exploration Method. MarcoPolo can realize any tradeoff between exploitation and exploration throughout the whole learning process. Applying MarcoPolo to numerical examples shows its effectiveness.
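The abstract does not describe how MarcoPolo combines its two behaviours, so the following Python sketch is an illustration only and not the authors' algorithm. It assumes a simplified reading of k-certainty, in which a state-action pair counts as explored once it has been tried `k` times, and it uses a hypothetical `explore_prob` parameter to set the tradeoff between exploring such pairs and exploiting current value estimates. The names `select_action`, `counts`, `values`, and `explore_prob` are all invented for this example.

```python
import random
from collections import defaultdict


def select_action(state, actions, counts, values, k, explore_prob, rng):
    """Pick an action by mixing count-based exploration with greedy exploitation.

    Assumption (not from the paper): a (state, action) pair is treated as
    still uncertain while it has been tried fewer than k times.
    """
    # Actions in this state that have not yet been tried k times.
    uncertain = [a for a in actions if counts[(state, a)] < k]
    if uncertain and rng.random() < explore_prob:
        # Explore: take the least-tried uncertain action.
        return min(uncertain, key=lambda a: counts[(state, a)])
    # Exploit: take the action with the highest current value estimate.
    return max(actions, key=lambda a: values[(state, a)])


# Usage: one state with two actions, where "right" currently looks better.
rng = random.Random(0)
counts = defaultdict(int)      # visit counts per (state, action)
values = defaultdict(float)    # value estimates per (state, action)
values[("s", "right")] = 1.0

explore_choice = select_action("s", ["left", "right"], counts, values,
                               k=2, explore_prob=1.0, rng=rng)
exploit_choice = select_action("s", ["left", "right"], counts, values,
                               k=2, explore_prob=0.0, rng=rng)
```

Setting `explore_prob` to 0 gives pure exploitation and setting it to 1 gives exploration until every action in the state has been tried `k` times, after which the selector falls back to the greedy choice. Intermediate values give one possible way to realize a tradeoff of the kind the abstract mentions.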
