Multiple answers to a question: a new approach for visual question answering

Hosseinabad Sayedshayan Hashemi; Safayani Mehran; Mirzaei Abdolreza

Abstract

With the advent of deep learning, multi-modal data have attracted great interest. One multi-modal task in the computer vision domain is visual question answering (VQA). In VQA, a question and an image are fed into a model, which tries to answer the question based on the image. To the best of our knowledge, current techniques look at the image and give only one answer to the question. However, in some situations, a question has several answers. In this paper, we address this problem by defining a new domain within the VQA task and proposing a new computationally efficient approach to multiple-answer VQA. In this approach, we use a sliding window efficiently to examine the answer to the question in different parts of the image. Because no suitable dataset has so far been available for multiple-answer VQA, we provide a new dataset for evaluating our proposed model. The experiments show that our model uses 94% fewer operations than other models, making it well suited to real-time applications.
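The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it is a naive scan that queries a stand-in answer function once per window and collects the distinct answers. The names `sliding_windows`, `multiple_answers`, and `toy_answer_fn` are hypothetical, as are the toy image and the window and stride values. How the paper makes the scan computationally efficient is not described in the abstract and is not reproduced here.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple

# Toy grayscale image: a list of rows, each row a list of pixel values.
Image = Sequence[Sequence[int]]


def sliding_windows(height: int, width: int, win: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every full window that fits in the image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield top, left


def multiple_answers(
    image: Image,
    question: str,
    answer_fn: Callable[[Image, str], Optional[str]],
    win: int,
    stride: int,
) -> List[str]:
    """Ask the question on each window and keep the distinct answers in scan order."""
    height, width = len(image), len(image[0])
    answers: List[str] = []
    for top, left in sliding_windows(height, width, win, stride):
        crop = [row[left:left + win] for row in image[top:top + win]]
        answer = answer_fn(crop, question)
        if answer is not None and answer not in answers:
            answers.append(answer)
    return answers


def toy_answer_fn(crop: Image, question: str) -> Optional[str]:
    """Hypothetical stand-in for a VQA model: maps marker pixel values to labels."""
    names = {1: "cat", 2: "dog"}
    for row in crop:
        for value in row:
            if value in names:
                return names[value]
    return None


image = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
print(multiple_answers(image, "What animals are in the picture?", toy_answer_fn, win=2, stride=2))
# prints ['cat', 'dog']
```

A single-answer model applied to the whole image would return at most one of these labels; scanning regions is what lets more than one answer surface. The cost of this naive version grows with the number of windows, which is the overhead an efficient approach would need to avoid.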

Bibliographic details

Source: The Visual Computer, 2021, Issue 1, pp. 119-131 (13 pages)
Authors: Hosseinabad Sayedshayan Hashemi; Safayani Mehran; Mirzaei Abdolreza
Author affiliation (listed for each of the three authors): Isfahan Univ Technol, Dept Elect & Comp Engn, Esfahan 8415683111, Iran
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Visual question answering; Deep learning; Convolution neural network; Multiple answers; Recurrent neural network
