Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Abstract

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-armed bandit framework, where the objective is to minimize the total regret over a sequence of tasks by transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for estimating the possible tasks and derive regret bounds for it.

Bibliographic details

Source
Annual Conference on Neural Information Processing Systems, 2013, 9 pages
Authors
Mohammad Gheshlaghi Azar; Alessandro Lazaric; Emma Brunskill
Original format
PDF
Chinese Library Classification
Information processing

Similar literature

1. Priority index heuristic for multi-armed bandit problems with set-up costs and/or set-up time delays [J]. F. Dusonchet, M.-O. Hongler. International Journal of Computer Integrated Manufacturing. 2006, No. 3
2. Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits [J]. Xinming Liu, Joseph Halpern. JMLR: Workshop and Conference Proceedings. 2020, No. 2010
3. An Asymptotically Optimal Heuristic for General Nonstationary Finite-Horizon Restless Multi-Armed, Multi-Action Bandits [J]. Zayas-Caban Gabriel, Jasin Stefanus, Wang Guihua. Advances in Applied Probability. 2019, No. 3
4. Sequential Transfer in Multi-armed Bandit with Finite Set of Models [C]. Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill. Annual Conference on Neural Information Processing Systems. 2013
5. Essays on sequential analysis: Multi-armed bandit with availability constraints and sequential change detection and identification [D]. Yamazaki, Kazutoshi. 2009
6. Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges [O]. Sofía S. Villar, Jack Bowden, James Wason
7. Sequential Transfer in Multi-armed Bandit with Finite Set of Models [O]. Gheshlaghi Azar Mohammad, Lazaric Alessandro, Brunskill Emma. 2013
