Convergent actor critic-based fuzzy reinforcement learning apparatus and method

Abstract

A system is controlled by an actor-critic based fuzzy reinforcement learning algorithm that provides instructions to a processor of the system for applying actor-critic based fuzzy reinforcement learning. The system includes a database of fuzzy-logic rules for mapping input data to output commands for modifying a system state, and a reinforcement learning algorithm for updating the fuzzy-logic rules database based on effects on the system state of the output commands mapped from the input data. The reinforcement learning algorithm is configured to converge at least one parameter of the system state to at least approximately an optimum value following multiple mapping and updating iterations. The reinforcement learning algorithm may be based on an update equation including a derivative with respect to said at least one parameter of a logarithm of a probability function for taking a selected action when a selected state is encountered. The reinforcement learning algorithm may be configured to update the at least one parameter based on said update equation. The system may include a wireless transmitter.
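
The update described in the abstract, a step along the derivative of the logarithm of the action probability with respect to a rule parameter, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the normalized Gaussian membership functions, the Gaussian action distribution, the one-step toy task in which the best action cancels the input, and all constants and function names.

```python
import math
import random


def memberships(x, centers, width):
    """Normalized Gaussian firing strengths of each fuzzy rule for input x."""
    raw = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def grad_log_policy(action, x, theta, centers, width, sigma):
    """Derivative of log N(action; mean(x), sigma^2) w.r.t. each theta_i,
    where mean(x) = sum_i phi_i(x) * theta_i (rule-weighted output)."""
    phi = memberships(x, centers, width)
    mean = sum(p * t for p, t in zip(phi, theta))
    return [(action - mean) / sigma ** 2 * p for p in phi]


def train(steps=20000, seed=0):
    """Toy actor-critic loop over a three-rule fuzzy rule base (hypothetical setup)."""
    rng = random.Random(seed)
    centers, width, sigma = [-1.0, 0.0, 1.0], 0.5, 0.3
    theta = [0.0, 0.0, 0.0]   # actor: one consequent parameter per rule
    value = [0.0, 0.0, 0.0]   # critic: one value weight per rule
    alpha, beta = 0.005, 0.05  # actor and critic step sizes (assumed)
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)                  # observed input
        phi = memberships(x, centers, width)
        mean = sum(p * t for p, t in zip(phi, theta))
        action = rng.gauss(mean, sigma)             # stochastic output command
        reward = -(action + x) ** 2                 # toy effect on the system state
        baseline = sum(p * w for p, w in zip(phi, value))
        delta = reward - baseline                   # one-step critic error
        grad = grad_log_policy(action, x, theta, centers, width, sigma)
        for i in range(len(theta)):
            value[i] += beta * delta * phi[i]       # critic update
            theta[i] += alpha * delta * grad[i]     # actor update along grad log pi
    return theta, centers, width


if __name__ == "__main__":
    theta, centers, width = train()
    print("learned rule consequents:", theta)
```

Because the toy task ends after a single step, the critic error has no next-state term. A sequential task would need a discounted next-state value estimate in `delta`.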

Bibliographic details

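Publication number: US2002198854A1
Patent type:
Publication date: 2002-12-26
Original format: PDF
Applicant/assignee: BERENJI HAMID R.; VENGROV DAVID
Application number: US20010027817
Inventors: DAVID VENGROV; HAMID R. BERENJI
Filing date: 2001-12-21
Classification: G06F15/18
Country: US
Database entry time: 2022-08-22 00:08:47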
