AAAI Conference on Artificial Intelligence

Predictive Off-Policy Policy Evaluation for Nonstationary Decision Problems, with Applications to Digital Marketing



Abstract

In this paper we consider the problem of evaluating one digital marketing policy (or, more generally, a policy for an MDP with unknown transition and reward functions) using data collected from the execution of a different policy. We call this problem off-policy policy evaluation. Existing methods for off-policy policy evaluation assume that the transition and reward functions of the MDP are stationary - an assumption that is typically false, particularly for digital marketing applications. This means that existing off-policy policy evaluation methods are reactive to nonstationarity, in that they slowly correct for changes after they occur. We argue that off-policy policy evaluation for nonstationary MDPs can be phrased as a time series prediction problem, which results in predictive methods that can anticipate changes before they happen. We therefore propose a synthesis of existing off-policy policy evaluation methods with existing time series prediction methods, and show that it results in a drastic reduction of mean squared error when evaluating policies using a real digital marketing data set.
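The synthesis the abstract describes can be illustrated with a minimal sketch: compute a per-episode importance-sampling estimate of the evaluation policy's value from behavior-policy data, treat those estimates as a time series indexed by episode, and forecast the next value rather than averaging past ones. The function names, the list-of-lists policy representation, and the use of a simple linear trend (in place of the richer time-series models the paper pairs with OPE) are all illustrative assumptions, not the authors' implementation.

```python
def per_episode_is_estimate(actions, rewards, pi_e, pi_b):
    """Ordinary importance-sampling estimate of the evaluation
    policy's return from one episode run under the behavior policy.

    actions: action index taken at each step
    rewards: reward observed at each step
    pi_e, pi_b: per-step action probabilities, indexed [t][a]
    (hypothetical data layout, chosen for this sketch)
    """
    ratio = 1.0
    for t, a in enumerate(actions):
        ratio *= pi_e[t][a] / pi_b[t][a]  # likelihood ratio of the trajectory
    return ratio * sum(rewards)

def forecast_next_value(estimates):
    """Predictive step: fit a least-squares linear trend to the series
    of per-episode estimates and extrapolate one episode ahead.
    A stand-in for the time-series prediction methods in the paper."""
    n = len(estimates)
    mean_x = (n - 1) / 2.0
    mean_y = sum(estimates) / n
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in enumerate(estimates))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope * n + intercept  # value predicted for episode n
```

Under nonstationarity, a reactive estimator would average the per-episode estimates and lag behind any drift; the forecasting step instead anticipates where the policy's value is heading.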

