IEEE Conference on Computer Communications Workshops

Delay-Optimal Traffic Engineering through Multi-agent Reinforcement Learning



Abstract

Traffic engineering (TE) is one of the most important methods of optimizing network performance: it designs forwarding and routing rules that meet the quality-of-service (QoS) requirements of a large volume of traffic flows. End-to-end (E2E) delay is one of the key TE metrics. Optimizing E2E delay, however, is very challenging in large-scale multihop networks because of profound network uncertainties and dynamics. This paper proposes a model-free TE framework that adopts multi-agent reinforcement learning for distributed control to minimize E2E delay. In particular, distributed TE is formulated as a multi-agent extension of the Markov decision process (MA-MDP). To solve this problem, a modular and composable learning framework is proposed, consisting of three interleaved modules: policy evaluation, policy improvement, and policy execution. Each module can be implemented with different algorithms and their extensions. Simulation results show that combining several extensions, such as double learning, expected policy evaluation, and on-policy learning, yields superior E2E delay performance under high traffic loads.
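The abstract names three reinforcement-learning ingredients (double learning, expected policy evaluation, and on-policy learning) without giving the algorithm itself. The following minimal Python sketch, which is not from the paper, illustrates one way those ingredients could be combined in a distributed next-hop-selection agent; the class name NodeAgent, the tabular state (the packet's destination), the epsilon-greedy policy, and all hyperparameter values are illustrative assumptions.

import random
from collections import defaultdict

class NodeAgent:
    """One agent per router; state = packet destination, action = next hop.

    Illustrative sketch only, not the authors' algorithm.
    """

    def __init__(self, neighbors, alpha=0.1, gamma=0.99, eps=0.1):
        self.neighbors = list(neighbors)          # candidate next hops
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        # Double learning: two independent Q-tables reduce the
        # overestimation bias of bootstrapping from a single table.
        self.q1 = defaultdict(float)
        self.q2 = defaultdict(float)

    def policy_probs(self, dest):
        # Epsilon-greedy distribution over next hops from the summed tables.
        qs = {a: self.q1[(dest, a)] + self.q2[(dest, a)] for a in self.neighbors}
        best = max(qs, key=qs.get)
        n = len(self.neighbors)
        return {a: self.eps / n + (1.0 - self.eps) * (a == best)
                for a in self.neighbors}

    def act(self, dest):
        # Sample a next hop from the behavior policy (policy execution).
        probs = self.policy_probs(dest)
        return random.choices(list(probs), weights=list(probs.values()))[0]

    def expected_value(self, dest):
        # Expected policy evaluation: the value of this node for a packet
        # headed to `dest`, averaged over this agent's own behavior policy,
        # i.e. on-policy in the sense that the bootstrap term uses the
        # policy actually being executed rather than a greedy max.
        probs = self.policy_probs(dest)
        return sum(p * 0.5 * (self.q1[(dest, a)] + self.q2[(dest, a)])
                   for a, p in probs.items())

    def update(self, dest, action, hop_delay, next_agent):
        # `next_agent` is the agent at the chosen next hop, or None once the
        # packet has arrived. The reward is the negative per-hop delay, so
        # each agent's return is minus the remaining end-to-end delay.
        target = -hop_delay
        if next_agent is not None:
            target += self.gamma * next_agent.expected_value(dest)
        # Double learning: randomly choose which table receives the update.
        q = self.q1 if random.random() < 0.5 else self.q2
        q[(dest, action)] += self.alpha * (target - q[(dest, action)])

In a full simulation, each router would run one such agent: when node i forwards a packet bound for destination d to neighbor j and measures a per-hop delay t, it would call agents[i].update(d, j, t, agents[j]) (or next_agent=None once the packet arrives). Because every agent maximizes the negative accumulated delay, greedy next-hop choices approximately minimize the E2E delay that the abstract targets.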
