Journal: Applied Mathematics and Optimization

The Discounted Method and Equivalence of Average Criteria for Risk-Sensitive Markov Decision Processes on Borel Spaces

Abstract

This note concerns discrete-time controlled Markov chains with Borel state and action spaces. Given a nonnegative cost function, the performance of a control policy is measured by the superior limit risk-sensitive average criterion associated with a constant and positive risk sensitivity coefficient. Within such a framework, the discounted approach is used (a) to establish the existence of solutions for the corresponding optimality inequality, and (b) to show that, under mild conditions on the cost function, the optimal value functions corresponding to the superior and inferior limit average criteria coincide on a certain subset of the state space. The approach of the paper relies on standard dynamic programming ideas and on a simple analytical derivation of a Tauberian relation.
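
For orientation, the two average criteria named in the abstract are usually written as follows. This is a sketch in the standard notation of the risk-sensitive literature, not notation taken from the paper: the symbols $\lambda$ (risk sensitivity coefficient), $C$ (cost function), $(X_t, A_t)$ (state-action process), $J_n$, $J$, $J_-$, $J^*$ and $J_*$ are all assumptions introduced here.

```latex
% Assumed notation: \lambda > 0 is the risk sensitivity coefficient,
% C \ge 0 the cost function, (X_t, A_t) the state-action process under
% policy \pi started at X_0 = x.
\begin{align*}
  % Certainty-equivalent of the total cost over n steps
  J_n(\pi, x) &= \frac{1}{\lambda}
      \log E_x^{\pi}\!\left[ \exp\!\Big( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \Big) \right], \\
  % Superior and inferior limit risk-sensitive average criteria
  J(\pi, x)   &= \limsup_{n \to \infty} \frac{1}{n} J_n(\pi, x), \qquad
  J_-(\pi, x)  = \liminf_{n \to \infty} \frac{1}{n} J_n(\pi, x), \\
  % Corresponding optimal value functions
  J^*(x) &= \inf_{\pi} J(\pi, x), \qquad
  J_*(x)  = \inf_{\pi} J_-(\pi, x).
\end{align*}
% Since \liminf \le \limsup, one always has J_*(x) \le J^*(x);
% result (b) of the abstract concerns states x where equality holds.
```

In this notation, the inequality $J_*(x) \le J^*(x)$ holds for every state, so the content of the equivalence result is the reverse inequality on the subset of the state space the abstract refers to.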