In our daily use of natural language, we constantly profit from our strong reasoning skills to interpret utterances we hear or read. At times we exploit implicit associations we have learned between words or between events; at other times we explicitly think about a problem and follow the reasoning steps carefully and slowly. We could say that the latter are the realm of logical approaches based on symbolic representations, whereas the former are better modelled by statistical models, like Neural Networks (NNs), based on continuous representations. My talk will focus on how NNs can learn to be engaged in a conversation about visual content. Specifically, I will present our work on Visual Dialogue (VD), taking as examples two task-oriented VD games, GuessWhat?! [2] and GuessWhich [1]. In these tasks, two NN agents interact with each other so that one of the two (the Questioner), by asking questions to the other (the Answerer), can guess which object the Answerer has in mind among all the entities in a given image (GuessWhat?!) or which image the Answerer sees among several shown to the Questioner at the end of the dialogue (GuessWhich). I will present our Questioner model: it encodes both visual and textual inputs, produces a multimodal representation, generates natural language questions, understands the Answerer's responses, and guesses the object/image. I will show how training the NN agent's modules (Question Generator and Guesser) jointly and cooperatively helps the model's performance and increases the quality of the dialogues. In particular, I will compare our model's dialogues with those of VD models that exploit much more complex learning paradigms, like Reinforcement Learning, showing that more complex machine learning methods do not necessarily correspond to better dialogue quality or even better quantitative performance. The talk is based on [3] and other work available at https://vista-unitn-uva.github.io/.
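To make the described modular structure concrete, below is a minimal, illustrative sketch (not the authors' implementation) of a Questioner agent with a visual-and-dialogue encoder producing a multimodal representation, a question-generation head, and a guesser head that are trained jointly through a shared encoder. All module names, dimensions, and the fusion scheme are assumptions chosen purely for illustration.

```python
# Hypothetical sketch of a Questioner agent: a shared multimodal encoder feeding
# two heads (question generator and guesser), as described in the abstract.
import torch
import torch.nn as nn

class Questioner(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=256, img_dim=2048,
                 hid_dim=512, n_candidates=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.dialogue_enc = nn.LSTM(emb_dim, hid_dim, batch_first=True)  # encodes the Q/A history
        self.img_proj = nn.Linear(img_dim, hid_dim)                      # projects image features
        self.fuse = nn.Linear(2 * hid_dim, hid_dim)                      # multimodal representation
        self.qgen = nn.Linear(hid_dim, vocab_size)                       # question-generator head (simplified to next-token scores)
        self.guesser = nn.Linear(hid_dim, n_candidates)                  # scores candidate objects/images

    def forward(self, dialogue_tokens, image_feats):
        # dialogue_tokens: (batch, seq_len) token ids; image_feats: (batch, img_dim)
        _, (h, _) = self.dialogue_enc(self.embed(dialogue_tokens))
        state = torch.tanh(self.fuse(torch.cat([h[-1], self.img_proj(image_feats)], dim=-1)))
        return self.qgen(state), self.guesser(state)

# Joint, cooperative training would optimize a combined objective, e.g.
# loss = question_generation_loss + guesser_loss, backpropagated through the
# shared encoder so that both modules shape the multimodal representation.
```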