Source: JMLR: Workshop and Conference Proceedings

Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games


Abstract

We revisit the problem of solving two-player zero-sum games in the decentralized setting. We propose a simple algorithmic framework that simultaneously achieves the best rates for honest regret as well as adversarial regret, and in addition resolves the open problem of removing the logarithmic terms in convergence to the value of the game. We achieve this goal in three steps. First, we provide a novel analysis of optimistic mirror descent (OMD), showing that it can be modified to guarantee fast convergence for both honest regret and the value of the game when the players are playing collaboratively. Second, we propose a new algorithm, dubbed robust optimistic mirror descent (ROMD), which attains optimal adversarial regret without knowing the time horizon beforehand. Finally, we propose a simple signaling scheme, which enables us to bridge OMD and ROMD to achieve the best of both worlds. Numerical examples are presented to support our theoretical claims and show that our non-adaptive ROMD algorithm can be competitive with OMD with adaptive step-size selection.
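To make the setting concrete, here is a minimal sketch of two players running optimistic mirror descent with an entropic regularizer (often called optimistic Hedge) against each other on a small matrix game. This is a textbook variant, not the modified OMD, ROMD, or signaling scheme from the paper, none of which are specified in the abstract. The payoff matrix `A`, the fixed step size `eta`, the horizon `steps`, and all function names are assumptions chosen for illustration.

```python
import math


def gibbs(scores):
    """Distribution proportional to exp(-score), shifted for numerical stability."""
    lo = min(scores)
    weights = [math.exp(-(s - lo)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def optimistic_hedge(A, steps=2000, eta=0.1):
    """Both players run optimistic Hedge; returns the averaged strategies.

    The row player minimizes x^T A y and the column player maximizes it.
    Each player uses its most recent loss vector as the prediction of the
    next one, which is the "optimistic" part. eta and steps are illustrative.
    """
    n, m = len(A), len(A[0])
    cum_x, cum_y = [0.0] * n, [0.0] * m    # cumulative loss vectors
    last_x, last_y = [0.0] * n, [0.0] * m  # most recent loss vectors
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(steps):
        # Play against cumulative loss plus the predicted next loss.
        x = gibbs([eta * (c + g) for c, g in zip(cum_x, last_x)])
        y = gibbs([eta * (c + g) for c, g in zip(cum_y, last_y)])
        # Observe losses: A y for the row player, -A^T x for the column player.
        last_x = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        last_y = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        cum_x = [c + g for c, g in zip(cum_x, last_x)]
        cum_y = [c + g for c, g in zip(cum_y, last_y)]
        avg_x = [a + p / steps for a, p in zip(avg_x, x)]
        avg_y = [a + p / steps for a, p in zip(avg_y, y)]
    return avg_x, avg_y


def duality_gap(A, x, y):
    """max over columns of x^T A e_j minus min over rows of e_i^T A y."""
    n, m = len(A), len(A[0])
    best_col = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_row = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_col - best_row


# Hypothetical 2x2 game whose unique equilibrium is x = y = (2/5, 3/5).
A = [[2.0, -1.0], [-1.0, 1.0]]
x_bar, y_bar = optimistic_hedge(A)
gap = duality_gap(A, x_bar, y_bar)
print(x_bar, y_bar, gap)
```

The duality gap of the averaged strategies measures how far the pair is from equilibrium, and is the quantity behind "convergence to the value of the game". In this sketch both players follow the protocol honestly; a fixed `eta` of this kind gives no protection against an opponent who deviates, which is the gap the abstract says ROMD and the signaling scheme are meant to close.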
