Adaptive online optimization of Markov reward processes with application to pricing of multiclass loss network services.

Abstract

This work studies the problem of adaptive online optimization of Markov reward processes. The problem is the following: given a Markov chain whose transition probability matrix and expected cost per stage are functions of (1) a set of tunable parameters and (2) a set of unknown but fixed parameters, find the set of tunable parameters that maximizes the observed average reward per stage. This work introduces techniques that improve the performance of existing simulation-based methods and that are robust to uncertainty in the system parameters. We show the almost sure convergence of the algorithms to locally optimal values, including in the adaptive case, while the tracking ability of the adaptive algorithm is illustrated numerically.

The methodological work on online methods is applied to a significant optimization problem, namely setting prices for services in multiclass loss networks. Such networks consist of a set of resources shared by multiple classes of users, each characterized by its usage pattern. The network sets a per-call price for each class, and users are assumed to be price-sensitive in the sense that prices affect the arrival process. The algorithms developed here are applied to this problem. Their tracking ability is illustrated by scenarios in which the service time parameters change smoothly, or infrequently, over time.
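To make the pricing objective concrete, here is a minimal sketch of a much simpler setting than the one the abstract describes: a single link with one user class, where the long-run revenue can be computed exactly from the Erlang B formula and maximized by grid search. This is not the simulation-based online algorithm of the thesis. The exponential demand curve, the parameter values, and all function names (`erlang_b`, `arrival_rate`, `average_revenue`, `best_price`) are assumptions made purely for illustration.

```python
import math


def erlang_b(offered_load: float, capacity: int) -> float:
    """Blocking probability of an M/M/C/C link, via the standard stable recursion."""
    b = 1.0  # with zero circuits every call is blocked
    for k in range(1, capacity + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def arrival_rate(price: float, base_rate: float = 10.0) -> float:
    """Hypothetical demand curve (an assumption): arrivals decay exponentially with price."""
    return base_rate * math.exp(-price)


def average_revenue(price: float, capacity: int = 5, service_rate: float = 1.0) -> float:
    """Long-run revenue per unit time: price times the rate of accepted calls."""
    lam = arrival_rate(price)
    blocking = erlang_b(lam / service_rate, capacity)
    return price * lam * (1.0 - blocking)


def best_price(prices):
    """Exhaustive search over candidate prices (stands in for an online method)."""
    return max(prices, key=average_revenue)


# Candidate prices 0.01, 0.02, ..., 5.00 (illustrative range).
grid = [i / 100 for i in range(1, 501)]
p_star = best_price(grid)
print(f"best price on grid: {p_star:.2f}, revenue rate: {average_revenue(p_star):.3f}")
```

In this toy model the tunable parameter is the price and the unknown-but-fixed parameters would be quantities such as `service_rate`. An online method would have to estimate the effect of the price from observed calls rather than from a closed-form expression.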