REINFORCEMENT LEARNING EXPLORATION BY EXPLOITING PAST EXPERIENCES FOR CRITICAL EVENTS

Abstract

A computer-implemented method is provided for reinforcement learning performed by a processor. The method includes obtaining, from an environment, a given experience that includes an action, a state, and a reward. The method further includes storing the given experience in an experience buffer responsive to a value of the reward included in the given experience exceeding a first threshold. The method also includes, responsive to obtaining another experience having another reward that is less than or equal to the first threshold, searching the experience buffer for a candidate experience with a state similar to that of the other experience and copying the candidate experience into an event buffer. The method additionally includes, during exploration, selecting an action to be taken in the environment from the event buffer with a predetermined probability.
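
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class name `CriticalEventExplorer`, the parameter names, the use of Euclidean distance within a fixed radius as the notion of a "similar state", and the fallback to a uniformly random action when the event buffer is not used are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

State = Tuple[float, ...]


@dataclass(frozen=True)
class Experience:
    state: State
    action: int
    reward: float


class CriticalEventExplorer:
    """Hypothetical sketch of the two-buffer exploration scheme."""

    def __init__(self, reward_threshold: float, similarity_radius: float,
                 event_probability: float, n_actions: int,
                 rng: Optional[random.Random] = None) -> None:
        self.reward_threshold = reward_threshold    # the "first threshold"
        self.similarity_radius = similarity_radius  # assumed similarity test
        self.event_probability = event_probability  # the "predetermined probability"
        self.n_actions = n_actions
        self.rng = rng or random.Random()
        self.experience_buffer: List[Experience] = []
        self.event_buffer: List[Experience] = []

    def observe(self, exp: Experience) -> None:
        # Reward above the threshold: keep the experience.
        if exp.reward > self.reward_threshold:
            self.experience_buffer.append(exp)
            return
        # Reward at or below the threshold: look up a stored experience
        # with a similar state and copy it into the event buffer.
        candidate = self._most_similar(exp.state)
        if candidate is not None:
            self.event_buffer.append(candidate)

    def _most_similar(self, state: State) -> Optional[Experience]:
        # Assumption: "similar" means closest in Euclidean distance,
        # and no farther away than similarity_radius.
        best, best_dist = None, self.similarity_radius
        for stored in self.experience_buffer:
            dist = math.dist(stored.state, state)
            if dist <= best_dist:
                best, best_dist = stored, dist
        return best

    def explore_action(self) -> int:
        # With the predetermined probability, reuse an action from the
        # event buffer; otherwise (assumption) pick a random action.
        if self.event_buffer and self.rng.random() < self.event_probability:
            return self.rng.choice(self.event_buffer).action
        return self.rng.randrange(self.n_actions)
```

In this sketch, calling `observe` with a high-reward experience fills the experience buffer, a later low-reward experience in a nearby state copies that stored experience into the event buffer, and `explore_action` then replays its action with probability `event_probability`.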