International Journal of Adaptive Control and Signal Processing
Reinforcement learning based closed-loop reference model adaptive flight control system design


Abstract

In this study, we present a reinforcement learning (RL)-based flight control system design method to improve the transient response performance of a closed-loop reference model (CRM) adaptive control system. The methodology, referred to as RL-CRM, relies on generating a dynamic adaptation strategy by applying RL to the variable factor in the feedback path gain matrix of the reference model. An actor-critic RL agent is designed using performance-driven reward functions and tracking-error observations from the environment. In the training phase, a deep deterministic policy gradient (DDPG) algorithm is used to learn the time-varying adaptation strategy of the design parameter in the reference model feedback gain matrix. The proposed control structure makes it possible to learn numerous adaptation strategies across a wide range of flight and vehicle conditions without relying on high-fidelity simulators, flight testing, or real flight operations. The performance of the proposed system was evaluated on an identified and verified mathematical model of an agile quadrotor platform. Monte Carlo simulations and worst-case analysis were also performed on a benchmark helicopter example model. Compared with classical model reference adaptive control and CRM-adaptive control system designs, the proposed RL-CRM adaptive flight control system improves transient response performance on all associated metrics and provides the capability to operate over a wide range of parametric uncertainties.
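
The CRM structure the abstract refers to can be illustrated with a minimal simulation: the reference model is closed through a feedback path L(x - x_m), and the scalar factor in that gain is the design parameter the RL agent would adapt. The sketch below, in Python/NumPy, assumes a simple two-state plant whose nominal dynamics already match the reference model, a matched parametric uncertainty, and a fixed gain factor `ell`; all matrices, gains, and variable names are illustrative assumptions, not the paper's vehicle model.

```python
# Minimal closed-loop reference model (CRM) adaptive control sketch.
# Assumption: plant = reference-model dynamics plus a matched uncertainty theta*^T x;
# the reference-model feedback path gain is L = ell * I with scalar factor `ell`.
import numpy as np

def simulate_crm(ell, dt=1e-3, T=10.0, gamma=20.0):
    Am = np.array([[-2.0, 1.0], [-1.0, -2.0]])   # Hurwitz reference-model matrix (Am + Am^T < 0)
    B = np.array([[0.0], [1.0]])
    theta_star = np.array([[0.8], [-0.5]])       # unknown matched uncertainty
    P = np.eye(2)                                # valid Lyapunov matrix since Am + Am^T < 0

    x = np.zeros((2, 1))                         # plant state
    xm = np.zeros((2, 1))                        # reference-model state
    theta = np.zeros((2, 1))                     # adaptive estimate of theta_star
    peak = 0.0
    for _ in range(int(T / dt)):
        r = 1.0                                  # step reference
        e = x - xm                               # model-tracking error
        u = r - float(theta.T @ x)               # adaptive cancellation control
        xdot = Am @ x + B * (u + float(theta_star.T @ x))
        # CRM: the feedback term ell*(x - xm) closes the reference model around the plant
        xm_dot = Am @ xm + B * r + ell * e
        # Lyapunov-based gradient adaptive law
        theta_dot = gamma * x * float(B.T @ P @ e)
        x, xm, theta = x + dt * xdot, xm + dt * xm_dot, theta + dt * theta_dot
        peak = max(peak, float(np.linalg.norm(e)))
    return peak, float(np.linalg.norm(e))

if __name__ == "__main__":
    for ell in (0.0, 1.0, 5.0, 20.0):            # ell = 0 recovers classical open-loop MRAC
        peak, final = simulate_crm(ell)
        print(f"ell={ell:5.1f}  peak |e|={peak:.4f}  final |e|={final:.5f}")
```

Holding `ell` fixed, as above, reproduces the ordinary CRM trade-off between transient smoothness and tracking speed; the RL-CRM idea is to let a learned policy vary this factor online.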
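The actor-critic agent described in the abstract can likewise be sketched. Assuming the observation is a short window of tracking errors, the action is the gain factor `ell`, and the reward penalizes tracking error, a minimal DDPG-style update (actor, critic, target networks, Polyak averaging) might look as follows in PyTorch; the network sizes, action bound, and reward are illustrative assumptions rather than the paper's design.

```python
# Sketch of an actor-critic (DDPG) agent that outputs the reference-model gain factor.
# Assumptions: OBS_DIM tracking-error features as observation, scalar action in [0, ELL_MAX],
# reward = negative squared tracking error. Hyperparameters are illustrative.
import copy
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, ELL_MAX = 4, 1, 20.0

class Actor(nn.Module):
    """Maps tracking-error observations to the gain factor ell in [0, ELL_MAX]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM), nn.Sigmoid(),
        )
    def forward(self, obs):
        return ELL_MAX * self.net(obs)

class Critic(nn.Module):
    """Estimates Q(obs, ell) under the performance-driven reward."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )
    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def reward(e):
    """Illustrative performance-driven reward: penalize the squared tracking error."""
    return -float(torch.sum(e * e))

def ddpg_update(actor, critic, actor_t, critic_t, batch,
                actor_opt, critic_opt, gamma=0.99, tau=0.005):
    obs, act, rew, next_obs = batch                # rew shaped (batch, 1)
    # Critic: regress Q(s, a) toward the one-step TD target from the target networks.
    with torch.no_grad():
        target_q = rew + gamma * critic_t(next_obs, actor_t(next_obs))
    critic_loss = nn.functional.mse_loss(critic(obs, act), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor: deterministic policy gradient, ascend the critic's value of the actor's action.
    actor_loss = -critic(obs, actor(obs)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    # Polyak averaging of the target networks.
    for t, s in zip(actor_t.parameters(), actor.parameters()):
        t.data.mul_(1 - tau).add_(tau * s.data)
    for t, s in zip(critic_t.parameters(), critic.parameters()):
        t.data.mul_(1 - tau).add_(tau * s.data)

actor, critic = Actor(), Critic()
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
```

In a training loop matching the abstract's description, the agent would interact with a simulated CRM loop such as the first sketch, store (obs, ell, reward, next_obs) transitions in a replay buffer, and call ddpg_update on sampled mini-batches so that the learned policy supplies a time-varying `ell` at deployment.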
