METHOD FOR INTERMEDIATE MODEL GENERATION USING HISTORICAL DATA AND DOMAIN KNOWLEDGE FOR RL TRAINING

Abstract

Embodiments may include novel techniques for intermediate model generation using historical data and domain knowledge for Reinforcement Learning (RL) training. Embodiments may start with gathering client data. For example, in an embodiment, a method, implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, may comprise identifying historical data and domain knowledge of a client including mathematical properties of features, generating an intermediate model comprising a probabilistic description of the environment, such as an MDP graph or transition probability matrix based on the identified historical data and domain knowledge, training a Reinforcement Learning (RL)/Deep Reinforcement Learning (DRL) model using the generated intermediate model, and deploying the trained Reinforcement Learning (RL)/Deep Reinforcement Learning (DRL) model and continuing training the trained Reinforcement Learning (RL)/Deep Reinforcement Learning (DRL) model from a real environment.
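The pipeline in the abstract (historical data plus domain knowledge, then an intermediate transition model, then RL training on that model) can be illustrated with a minimal sketch. The abstract does not say how the transition probabilities are estimated or which RL algorithm is used, so everything here is an assumption made for illustration: count-based estimation with Laplace smoothing, tabular Q-learning, a toy three-state history, and the hypothetical names `build_transition_model`, `q_learning_on_model`, and `allowed`. The `allowed` predicate stands in for a "mathematical property of a feature", in this case that the state index never decreases.

```python
import random
from collections import defaultdict


def build_transition_model(history, n_states, n_actions, allowed=None, smoothing=1.0):
    """Estimate P[s][a][s2] and mean reward R[s][a] from logged (s, a, r, s2) tuples.

    `allowed(s, a, s2)` is a hypothetical hook for domain knowledge: any
    transition it rejects gets probability zero, whatever the data says.
    """
    counts = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    reward_sum = defaultdict(float)
    visits = defaultdict(int)
    for s, a, r, s2 in history:
        counts[s][a][s2] += 1.0
        reward_sum[(s, a)] += r
        visits[(s, a)] += 1

    P = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    R = [[0.0] * n_actions for _ in range(n_states)]
    for s in range(n_states):
        for a in range(n_actions):
            # Smoothed counts for permitted successors, zero for forbidden ones.
            weights = [
                counts[s][a][s2] + smoothing
                if allowed is None or allowed(s, a, s2) else 0.0
                for s2 in range(n_states)
            ]
            total = sum(weights)
            if total == 0:  # no permitted successor: treat as a self-loop
                weights[s], total = 1.0, 1.0
            P[s][a] = [w / total for w in weights]
            if visits[(s, a)]:
                R[s][a] = reward_sum[(s, a)] / visits[(s, a)]
    return P, R


def q_learning_on_model(P, R, episodes=2000, horizon=20,
                        alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning that samples next states from the intermediate model."""
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])  # exploit
            s2 = rng.choices(range(n_states), weights=P[s][a])[0]
            Q[s][a] += alpha * (R[s][a] + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q


# Toy logged data (made up for this sketch): action 1 tends to advance the state.
history = [
    (0, 1, 0.0, 1), (1, 1, 1.0, 2), (0, 0, 0.0, 0), (1, 0, 0.0, 1),
    (2, 0, 1.0, 2), (2, 1, 1.0, 2), (0, 1, 0.0, 1), (1, 1, 1.0, 2),
]
P, R = build_transition_model(
    history, n_states=3, n_actions=2,
    allowed=lambda s, a, s2: s2 >= s,  # assumed domain rule: state is monotone
)
Q = q_learning_on_model(P, R)
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(3)]
print(policy)
```

In this sketch the learned Q-table would serve only as a starting point. The abstract's final step, continuing training against the real environment after deployment, would replace the sampled `s2` and modelled reward with observed ones.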

Bibliographic data

Publication number: US2022114407A1
Patent type:
Publication date: 2022-04-14
Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Application number: US202017068116
Inventor: ALEXANDER ZADOROJNIY
Filing date: 2020-10-12
Classification: G06K9/62; G06N20
Country: US
Date indexed: 2022-08-25 00:27:01
