
Online optimal control of unknown discrete-time nonlinear systems by using time-based adaptive dynamic programming



Abstract

In this paper, an online optimal control scheme for a class of unknown discrete-time (DT) nonlinear systems is developed. The proposed algorithm uses current and recorded data to obtain the optimal controller without knowledge of the system dynamics. To carry out the algorithm, a neural network (NN) is constructed to identify the unknown system. Then, based on the estimated system model, a novel time-based adaptive dynamic programming (ADP) algorithm that does not use the system dynamics is implemented on an actor-critic structure. Two NNs are used in the structure to generate the optimal cost and the optimal control policy. Both are updated once at each sampling instant, so the algorithm can be regarded as time-based. The persistence of excitation condition, which is generally required in adaptive control, is ensured by a new criterion that uses current and recorded data in the update of the critic neural network. Lyapunov techniques are used to show that the system states, cost function and control signals are all uniformly ultimately bounded (UUB) with small bounded errors, while explicitly considering the approximation errors caused by the three NNs. Finally, simulation results are provided to verify the effectiveness of the proposed approach. (C) 2015 Elsevier B.V. All rights reserved.
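
The abstract does not give the update laws, so the following is only a minimal sketch of the general idea of updating a critic from current and recorded data, not the authors' algorithm. It assumes a hypothetical scalar linear plant (`a`, `b`), a fixed feedback gain `K` in place of the actor NN, a one-weight critic V(x) = w x^2 in place of the critic NN, and a normalized gradient step on the Bellman residual; all of these names and values are illustrative choices.

```python
def bellman_residual(w, x, u, x_next, q=1.0, rho=1.0):
    """Residual of V(x) = r(x, u) + V(x_next) for the critic V(x) = w * x**2."""
    stage_cost = q * x * x + rho * u * u
    return stage_cost + w * x_next * x_next - w * x * x


def critic_step(w, batch, lr=0.5):
    """One normalized gradient step on the squared residuals summed over batch.

    batch holds (x, u, x_next) tuples: the current sample plus recorded ones.
    """
    grad = 0.0
    norm = 1.0  # normalization keeps the step size bounded
    for x, u, x_next in batch:
        d = x_next * x_next - x * x  # derivative of the residual w.r.t. w
        grad += bellman_residual(w, x, u, x_next) * d
        norm += d * d
    return w - lr * grad / norm


def run(steps=200, memory=5):
    """Update the critic once per sampling instant (assumed toy setup)."""
    a, b, K = 0.8, 1.0, 0.3  # hypothetical plant and fixed policy u = -K x
    w, x = 0.0, 1.0
    recorded = []  # most recent past samples, newest first
    for _ in range(steps):
        u = -K * x
        x_next = a * x + b * u  # stands in for the identified model
        sample = (x, u, x_next)
        w = critic_step(w, [sample] + recorded)
        recorded = ([sample] + recorded)[:memory]
        # Restart the state once it has decayed so the data stay informative.
        x = x_next if abs(x_next) > 1e-3 else 1.0
    return w


if __name__ == "__main__":
    print(run())
```

For this toy setup the cost of the fixed policy is p x^2 with p = (q + rho K^2) / (1 - (a - b K)^2), which is about 1.453 for the values above, so the weight returned by `run()` can be compared against that number. Keeping recorded samples in the batch means the regressor stays non-zero even when the current state is close to the origin, which is the role the abstract attributes to recorded data.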

Bibliographic details

  • Source
    Neurocomputing | 2015, Issue 1 | pp. 163-170 | 8 pages
  • Author affiliations

    Northeastern Univ, Natl Educ Minist, Key Lab Integrated Automat Proc Ind, Shenyang 110004, Peoples R China.;

    Northeastern Univ, Natl Educ Minist, Key Lab Integrated Automat Proc Ind, Shenyang 110004, Peoples R China.;

    Northeastern Univ, Natl Educ Minist, Key Lab Integrated Automat Proc Ind, Shenyang 110004, Peoples R China.;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA);
  • Original format: PDF
  • Language: English
  • CLC classification
  • Keywords

    Adaptive dynamic programming; Online optimal control; Reinforcement learning; Discrete-time systems;

