IEEE International Conference on Acoustics, Speech and Signal Processing

Corrupted Contextual Bandits: Online Learning with Corrupted Context



Abstract

We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side information, or context, available to a decision-maker) where the context used at each decision may be corrupted ("useless context"). This new problem is motivated by certain online settings, including clinical trial and ad recommendation applications. In order to address the corrupted-context setting, we propose to combine the standard contextual bandit approach with a classical multi-armed bandit mechanism. Unlike standard contextual bandit methods, we are able to learn from all iterations, even those with corrupted context, by improving the computation of the expectation for each arm. Promising empirical results are obtained on several real-life datasets.
