IEEE/CVF Conference on Computer Vision and Pattern Recognition

Learning Answer Embeddings for Visual Question Answering

Abstract

We propose a novel probabilistic model for visual question answering (Visual QA). The key idea is to infer two sets of embeddings: one for the image and the question jointly, and the other for the answers. The learning objective is to find the parameterization of those embeddings such that the correct answer has a higher likelihood than the other possible answers. In contrast to several existing approaches that treat Visual QA as multi-way classification, the proposed approach takes the semantic relationships among answers (as characterized by the embeddings) into consideration, instead of viewing the answers as independent ordinal numbers. Thus, the learned embedding function can be used to embed answers that are unseen in the training dataset. These properties make the approach particularly appealing for transfer learning in open-ended Visual QA, where the source dataset on which the model is learned has limited overlap with the target dataset in the space of answers. We have also developed large-scale optimization techniques for applying the model to datasets with a large number of answers, where the challenge is to properly normalize the proposed probabilistic model. We validate our approach on several Visual QA datasets and investigate its utility for transferring models across datasets. The empirical results show that the approach performs well not only on in-domain learning but also on transfer learning.
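The two-embedding idea can be illustrated with a minimal sketch. The abstract does not give the functional form of the model, so everything below is an assumption made for illustration: the score of an answer is taken to be the inner product between the joint image-question embedding and the answer embedding, the likelihood is a softmax over the candidate answers, and the embeddings are hand-picked toy vectors rather than outputs of learned networks. The names `answer_log_probs` and `negative_log_likelihood` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def answer_log_probs(joint_embedding, answer_embeddings):
    """Log-softmax over the scores of all candidate answers.

    Assumption: score = inner product of the joint (image, question)
    embedding with each answer embedding. The paper may use another form.
    """
    scores = [dot(joint_embedding, g) for g in answer_embeddings]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def negative_log_likelihood(joint_embedding, answer_embeddings, correct_index):
    """Training loss for one example: -log p(correct answer)."""
    return -answer_log_probs(joint_embedding, answer_embeddings)[correct_index]


# Toy embeddings (hypothetical values, standing in for learned networks).
f_iq = [1.0, 0.0]  # joint embedding of one image-question pair
answers = {"cat": [2.0, 0.0], "dog": [1.0, 1.0], "car": [-1.0, 0.5]}

names = list(answers)
log_p = answer_log_probs(f_iq, [answers[n] for n in names])
probs = {n: math.exp(lp) for n, lp in zip(names, log_p)}
best = max(probs, key=probs.get)
loss = negative_log_likelihood(f_iq, [answers[n] for n in names], names.index("cat"))

# An answer absent from training can be scored by embedding it and
# adding it to the candidate set; no new classifier output is needed.
answers_with_unseen = dict(answers, kitten=[1.8, 0.1])
unseen_names = list(answers_with_unseen)
unseen_log_p = answer_log_probs(f_iq, [answers_with_unseen[n] for n in unseen_names])
```

Under these assumptions, the normalizing sum in `answer_log_probs` runs over every candidate answer, which is the term that becomes expensive when the answer set is large and that the abstract's large-scale optimization techniques are meant to address.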