REINFORCEMENT LEARNING EXPLORATION BY EXPLOITING PAST EXPERIENCES FOR CRITICAL EVENTS

Abstract

A computer-implemented method is provided for reinforcement learning performed by a processor. The method includes obtaining, from an environment, a given experience that includes an action, a state, and a reward. The method further includes storing the given experience in an experience buffer responsive to a value of the reward included in the given experience exceeding a first threshold. The method also includes, responsive to obtaining another experience having another reward that is less than or equal to the first threshold, searching the experience buffer for a candidate experience with a state similar to that of the other experience and copying the candidate experience into an event buffer. The method additionally includes, during exploration, selecting from the event buffer, with a predetermined probability, an action to be taken in the environment.
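The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation; it only mirrors the claimed flow under several assumptions that the abstract does not specify: states are numeric tuples, "similar" means within a Euclidean distance `similarity_radius`, the closest stored experience is the one copied, and a caller-supplied `fallback` policy is used whenever the event buffer is not chosen. The names `Experience`, `CriticalEventExplorer`, `observe`, and `select_exploration_action` are hypothetical.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Experience:
    state: Tuple[float, ...]
    action: int
    reward: float


def euclidean(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    # Assumed similarity measure; the abstract only says "similar state".
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class CriticalEventExplorer:
    def __init__(self, reward_threshold: float, similarity_radius: float,
                 replay_probability: float, capacity: int = 10_000,
                 rng: Optional[random.Random] = None) -> None:
        self.reward_threshold = reward_threshold      # the "first threshold"
        self.similarity_radius = similarity_radius    # assumed parameter
        self.replay_probability = replay_probability  # "predetermined probability"
        self.experience_buffer = deque(maxlen=capacity)
        self.event_buffer = deque(maxlen=capacity)
        self.rng = rng or random.Random()

    def observe(self, exp: Experience) -> None:
        if exp.reward > self.reward_threshold:
            # Reward exceeds the threshold: store the experience.
            self.experience_buffer.append(exp)
            return
        # Reward <= threshold: look up a stored experience with a similar
        # state and copy it into the event buffer.
        candidate = self._most_similar(exp.state)
        if candidate is not None:
            self.event_buffer.append(candidate)

    def _most_similar(self, state: Tuple[float, ...]) -> Optional[Experience]:
        best, best_dist = None, self.similarity_radius
        for stored in self.experience_buffer:
            dist = euclidean(stored.state, state)
            if dist <= best_dist:
                best, best_dist = stored, dist
        return best

    def select_exploration_action(self, fallback: Callable[[], int]) -> int:
        # With the predetermined probability, reuse an action from the event
        # buffer; otherwise defer to the fallback policy (an assumption).
        if self.event_buffer and self.rng.random() < self.replay_probability:
            return self.rng.choice(self.event_buffer).action
        return fallback()
```

In use, `observe` would be called on every transition returned by the environment, and `select_exploration_action` would stand in for the exploratory branch of a policy such as epsilon-greedy, with `fallback` supplying the usual random action.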

Bibliographic Data

Publication No.: US2019385091A1
Publication date: 2019-12-19
Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Application No.: US201816009815
Inventors: ASIM MUNAWAR; GIOVANNI DE MAGISTRIS; RYUKI TACHIBANA
Filing date: 2018-06-15
Classification: G06N99; G06N7
Country: US
Date added to database: 2022-08-21 11:23:48