Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models

Abstract

Defining action spaces for conversational agents and optimizing their decision-making process with reinforcement learning is an enduring challenge. Common practice has been to use either handcrafted dialog acts or the output vocabulary (e.g., in neural encoder-decoders) as the action space. Both have their own limitations. This paper proposes a novel latent action framework that treats the action space of an end-to-end dialog agent as latent variables and develops unsupervised methods to induce that action space from the data. Comprehensive experiments examine both continuous and discrete action types and two different optimization methods based on stochastic variational inference. Results show that the proposed latent actions achieve superior empirical performance over previous word-level policy gradient methods on both DealOrNoDeal and MultiWoz dialogs. Our detailed analysis also provides insights into various latent variable approaches for policy learning and can serve as a foundation for developing better latent actions in future research.

Bibliographic record

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2019 | pp. 1208-1218 | 11 pages
Authors
Tiancheng Zhao; Kaige Xie; Maxine Eskenazi
Original format: PDF