Annual Conference on Neural Information Processing Systems (NeurIPS)

Faster Projection-free Convex Optimization over the Spectrahedron



Abstract

Minimizing a convex function over the spectrahedron, i.e., the set of all d x d positive semidefinite matrices with unit trace, is an important optimization task with many applications in optimization, machine learning, and signal processing. It is also notoriously difficult to solve at large scale, since standard techniques require computing expensive matrix decompositions. An alternative is the conditional gradient (CG) method (aka the Frank-Wolfe algorithm), which has regained much interest in recent years, mostly due to its application to this specific setting. The key benefit of the CG method is that it avoids expensive matrix decompositions altogether and requires only a single eigenvector computation per iteration, which is much more efficient. On the downside, the CG method in general converges at an inferior rate: the error for minimizing a β-smooth function after t iterations scales like β/t, and this rate does not improve even if the function is also strongly convex. In this work we present a modification of the CG method tailored to the spectrahedron. The per-iteration complexity of the method is essentially identical to that of the standard CG method: only a single eigenvector computation is required. For minimizing an α-strongly convex and β-smooth function, the expected error of the method after t iterations is O(min{β/t, (β·rank(X*)/(α^(1/4)·t))^(4/3), (β/(α·λ_min(X*)·t))^2}), where rank(X*) and λ_min(X*) are the rank and smallest nonzero eigenvalue of the optimal solution, respectively. Beyond the significant improvement in convergence rate, it also follows that when the optimum is low-rank, our method provides a better accuracy-rank tradeoff than the standard CG method. To the best of our knowledge, this is the first result that attains provably faster convergence rates for a CG variant for optimization over the spectrahedron. We also present encouraging preliminary empirical results.
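To illustrate the baseline the abstract compares against, the following is a minimal sketch of the standard conditional gradient (Frank-Wolfe) method over the spectrahedron, not the paper's accelerated variant. The linear minimization oracle over the spectrahedron reduces to a single extreme eigenvector computation: the minimizer of <G, V> over unit-trace PSD matrices is the rank-one matrix v·vᵀ, where v is the eigenvector of the gradient G with smallest eigenvalue. The function name, step-size schedule, and use of a dense eigensolver are illustrative choices, not taken from the paper.

```python
import numpy as np

def frank_wolfe_spectrahedron(grad, d, T=200):
    """Standard Frank-Wolfe sketch over {X : X PSD, tr(X) = 1}.

    Each iteration uses only one eigenvector computation (no full
    matrix decomposition of the iterate), which is the efficiency
    argument made for CG-type methods in this setting.
    """
    X = np.eye(d) / d                 # feasible start: scaled identity
    for t in range(1, T + 1):
        G = grad(X)
        # Linear minimization oracle: eigh returns eigenvalues in
        # ascending order, so column 0 is the minimal eigenvector.
        # (At scale one would use a Lanczos solver instead of eigh.)
        _, V = np.linalg.eigh(G)
        v = V[:, 0]
        eta = 2.0 / (t + 1)           # classic step size, O(beta/t) rate
        # Convex combination stays in the spectrahedron:
        # it preserves PSD-ness and unit trace.
        X = (1 - eta) * X + eta * np.outer(v, v)
    return X
```

As a usage example, minimizing the 1-smooth objective f(X) = ||X - M||_F^2 / 2 for a unit-trace PSD target M (gradient X - M) drives the iterate toward M at the β/t rate discussed above.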
