Source: International Conference on Algorithmic Learning Theory

On the Prior Sensitivity of Thompson Sampling


Abstract

The empirically successful Thompson Sampling algorithm for stochastic bandits has drawn much interest in understanding its theoretical properties. One important benefit of the algorithm is that it allows domain knowledge to be conveniently encoded as a prior distribution to balance exploration and exploitation more effectively. While it is generally believed that the algorithm's regret is low (high) when the prior is good (bad), little is known about the exact dependence. This paper is a first step towards answering this important question: focusing on a special yet representative case, we fully characterize the algorithm's worst-case dependence of regret on the choice of prior. As a corollary, these results also provide useful insights into the general sensitivity of the algorithm to the choice of priors, when no structural assumptions are made. In particular, with $p$ being the prior probability mass of the true reward-generating model, we prove $O(\sqrt{T/p})$ and $O(\sqrt{(1-p)T})$ regret upper bounds for the poor- and good-prior cases, respectively, as well as matching lower bounds. Our proofs rely on a fundamental property of Thompson Sampling and make heavy use of martingale theory, both of which appear novel in the Thompson-Sampling literature and may be useful for studying other behavior of the algorithm.
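The role of the prior mass p can be illustrated with a minimal simulation sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-arm, two-model Bernoulli instance in `MODELS`, the function name `thompson_sampling_regret`, and the horizon and run counts. The sketch runs Thompson Sampling over a finite model class: sample a model from the posterior, play the arm that is optimal under it, observe a reward from the true model, and update the posterior by Bayes' rule.

```python
import random

# Hypothetical instance: each model lists the Bernoulli mean of arm 0 and arm 1.
# Model 0 is treated as the true reward-generating model.
MODELS = [(0.6, 0.4), (0.4, 0.6)]


def thompson_sampling_regret(models, prior, true_idx, horizon, rng):
    """Run Thompson Sampling over a finite model class.

    Returns the cumulative expected regret against the best arm of the
    true model. Means are assumed to lie strictly inside (0, 1).
    """
    posterior = list(prior)
    true_means = models[true_idx]
    best_mean = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        # Sample a model index from the current posterior.
        m = rng.choices(range(len(models)), weights=posterior)[0]
        # Play the arm that is optimal under the sampled model.
        arm = max(range(len(models[m])), key=lambda a: models[m][a])
        # Observe a Bernoulli reward drawn from the true model.
        reward = 1 if rng.random() < true_means[arm] else 0
        regret += best_mean - true_means[arm]
        # Bayes update: reweight each model by the likelihood of the reward.
        likelihoods = [mu[arm] if reward else 1.0 - mu[arm] for mu in models]
        unnormalized = [w * l for w, l in zip(posterior, likelihoods)]
        total = sum(unnormalized)
        posterior = [u / total for u in unnormalized]
    return regret


if __name__ == "__main__":
    rng = random.Random(0)
    horizon, runs = 500, 50
    for p in (0.01, 0.5, 0.99):
        # p is the prior mass placed on the true model (index 0).
        avg = sum(
            thompson_sampling_regret(MODELS, [p, 1.0 - p], 0, horizon, rng)
            for _ in range(runs)
        ) / runs
        print(f"p={p:.2f}  average regret over {runs} runs: {avg:.2f}")
```

In this sketch a prior with p = 1 yields zero regret, since the true model is sampled at every step, which is consistent with the (1-p) factor in the good-prior bound. The simulation only illustrates the qualitative trend on one instance; it does not verify the worst-case bounds.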
