Conference proceedings: European Conference on Computer Vision

Automated Textual Descriptions for a Wide Range of Video Events with 48 Human Actions


Abstract

We present a hybrid method for generating textual descriptions of video based on actions. The method consists of an action classifier and a description generator. The action classifier detects and classifies the actions in the video so that they can serve as verbs for the description generator. The description generator (1) finds the actors (objects or persons) in the video and connects them correctly to the verbs, so that they fill the roles of subject, direct object, and indirect object, and (2) generates a sentence from the verb, subject, and direct and indirect objects. The novelty of our method is that we combine the discriminative power of a bag-of-features action detector with the generative power of a rule-based action descriptor. We show that this approach outperforms a homogeneous setup in which both the action detector and the action descriptor are rule-based.
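
The two-stage structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `select_verbs`, `Roles`, and `describe`, the score threshold, the naive verb inflection, and the fixed preposition "to" are all assumptions made for illustration. The classifier scores are passed in as plain numbers, since the abstract does not specify how the bag-of-features detector produces them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Roles:
    """Hypothetical container for the actors connected to one verb."""
    subject: str
    direct_object: Optional[str] = None
    indirect_object: Optional[str] = None


def select_verbs(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep actions whose classifier score reaches the (assumed) threshold,
    ordered from highest to lowest score."""
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [verb for verb, score in ranked if score >= threshold]


def third_person(verb: str) -> str:
    """Naive third-person singular inflection; irregular verbs are not handled."""
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    return verb + "s"


def describe(verb: str, roles: Roles) -> str:
    """Fill a fixed template: subject, verb, direct object, 'to' indirect object.
    The single preposition 'to' is a simplification of a rule-based generator."""
    subject = roles.subject[0].upper() + roles.subject[1:]
    parts = [subject, third_person(verb)]
    if roles.direct_object:
        parts.append(roles.direct_object)
    if roles.indirect_object:
        parts.extend(["to", roles.indirect_object])
    return " ".join(parts) + "."


if __name__ == "__main__":
    # Made-up scores standing in for the output of an action classifier.
    verbs = select_verbs({"give": 0.9, "walk": 0.2, "carry": 0.6})
    print(verbs)
    print(describe("give", Roles("the man", "the ball", "the woman")))
```

In this sketch the classifier only decides which verbs are present, and the template step decides how actors attach to each verb, mirroring the division of labour between the discriminative and the rule-based component.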
