首页> 外文期刊>Mathematical Programming >First-order algorithm with ${mathcal{O}({rm ln}(1{/}epsilon))}$ convergence for ${epsilon}$ -equilibrium in two-person zero-sum games
【24h】

First-order algorithm with ${mathcal{O}({rm ln}(1{/}epsilon))}$ convergence for ${epsilon}$ -equilibrium in two-person zero-sum games

机译:两人零和游戏中$ {epsilon} $均衡的具有$ {mathcal {O}({rm ln}(1 {/} epsilon)}} $收敛的一阶算法

获取原文
获取原文并翻译 | 示例
           

摘要

We propose an iterated version of Nesterov’s first-order smoothing method for the two-person zero-sum game equilibrium problem $$min_{x in Q_1}max_{y in Q_2} {x}^{rm T}{Ay} = max_{y in Q_2} min_{x in Q_1} {x}^{rm T}{Ay}.$$ This formulation applies to matrix games as well as sequential games. Our new algorithmic scheme computes an ${epsilon}$ -equilibrium to this min-max problem in ${mathcal {O}left(frac{|A|}{delta(A)} , {rm ln}(1{/}epsilon)right)}$ first-order iterations, where δ(A) is a certain condition measure of the matrix A. This improves upon the previous first-order methods which required ${mathcal {O}(1{/}epsilon)}$ iterations, and it matches the iteration complexity bound of interior-point methods in terms of the algorithm’s dependence on ${epsilon}$ . Unlike interior-point methods that are inapplicable to large games due to their memory requirements, our algorithm retains the small memory requirements of prior first-order methods. Our scheme supplements Nesterov’s method with an outer loop that lowers the target ${epsilon}$ between iterations (this target affects the amount of smoothing in the inner loop). Computational experiments both in matrix games and sequential games show that a significant speed improvement is obtained in practice as well, and the relative speed improvement increases with the desired accuracy (as suggested by the complexity bounds).
机译:针对两人零和游戏平衡问题$$ min_ {x,Q_1} max_ {y,Q_2},{x} ^ {rm T} {Ay} = max_,我们提出了Nesterov一阶平滑方法的迭代版本。 {Q_2中的y} Q_1中的min_ {x} {x} ^ {rm T} {Ay}。$$此公式适用于矩阵游戏以及顺序游戏。我们的新算法方案为此$ {mathcal {O} left(frac {| A |} {delta(A)},{rm ln}(1 {/}}中的最小-最大值问题计算一个$ {epsilon} $平衡点epsilon)(右)} $一阶迭代,其中δ(A)是矩阵A的某种条件量度。这比以前的需要$ {mathcal {O}(1 {/} epsilon)的一阶方法有所改进。 } $次迭代,并且根据算法对$ {epsilon} $的依赖性,它与内部点方法的迭代复杂度范围匹配。与由于内存需求而不适用于大型游戏的内部点方法不同,我们的算法保留了先前一阶方法的较小内存需求。我们的方案对Nesterov的方法进行了补充,增加了一个外部循环,该循环降低了两次迭代之间的目标$ {epsilon} $(此目标会影响内部循环中的平滑程度)。在矩阵博弈和顺序博弈中的计算实验均表明,在实践中也获得了显着的速度改进,并且相对速度的改进也随所需的精度而增加(如复杂性界限所建议)。

著录项

  • 来源
    《Mathematical Programming》 |2012年第2期|p.279-298|共20页
  • 作者单位

    Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA;

    Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA, USA;

    Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA;

  • 收录信息
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

    91A05; 90C33; 90C47; 90C52;

    机译:91 A05,90电容90 A47 S;

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号