Conference: International Joint Conference on Artificial Intelligence

Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling



Abstract

The visual storytelling (VST) task aims to generate a reasonable and coherent paragraph-level story from an image stream. Unlike a caption, which is a direct and literal description of image content, a story in the VST task tends to contain many imaginary concepts that do not appear in the images. This requires the AI agent to reason about and associate these imaginary concepts using implicit commonsense knowledge in order to generate a reasonable story describing the image stream. In this work, we therefore present a commonsense-driven generative model that introduces crucial commonsense from an external knowledge base into visual storytelling. Our approach first extracts a set of candidate knowledge graphs from the knowledge base. Then, an elaborately designed vision-aware directional encoding schema is adopted to effectively integrate the most informative commonsense. In addition, we maximize the semantic similarity within the output during decoding to enhance the coherence of the generated text. Results show that our approach outperforms state-of-the-art systems by a large margin, achieving a 29% relative improvement in CIDEr score. With the additional commonsense and the semantic-relevance-based objective, the generated stories are more diverse and coherent.
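The abstract does not say how "semantic similarity within the output" is measured. As a rough illustration only, the sketch below assumes one plausible reading: each generated sentence is represented by a fixed-length vector, and coherence is scored as the mean cosine similarity between adjacent sentences. The function names `cosine_similarity` and `coherence_score` and the use of adjacent pairs are assumptions made for this example, not the authors' formulation.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coherence_score(sentence_vectors: List[Sequence[float]]) -> float:
    """Mean cosine similarity over adjacent sentence pairs.

    Hypothetical coherence measure: the sentence vectors are assumed to come
    from some sentence encoder that is not specified in the abstract.
    """
    if len(sentence_vectors) < 2:
        return 0.0  # a single sentence has no adjacent pair to compare
    pairs = zip(sentence_vectors, sentence_vectors[1:])
    sims = [cosine_similarity(u, v) for u, v in pairs]
    return sum(sims) / len(sims)
```

Under this assumption, a decoder could be encouraged toward coherent stories by adding a weighted `coherence_score` term to its training objective; how the original model actually incorporates the term is not stated in the abstract.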
