Journal: The Journal of Artificial Intelligence Research

Gesture Salience as a Hidden Variable for Coreference Resolution and Keyframe Extraction

Abstract

Gesture is a non-verbal modality that can contribute crucial information to the understanding of natural language. But not all gestures are informative, and non-communicative hand motions may confuse natural language processing (NLP) and impede learning. People have little difficulty ignoring irrelevant hand movements and focusing on meaningful gestures, suggesting that an automatic system could also be trained to perform this task. However, the informativeness of a gesture is context-dependent and labeling enough data to cover all cases would be expensive. We present conditional modality fusion, a conditional hidden-variable model that learns to predict which gestures are salient for coreference resolution, the task of determining whether two noun phrases refer to the same semantic entity. Moreover, our approach uses only coreference annotations, and not annotations of gesture salience itself. We show that gesture features improve performance on coreference resolution, and that by attending only to gestures that are salient, our method achieves further significant gains. In addition, we show that the model of gesture salience learned in the context of coreference accords with human intuition, by demonstrating that gestures judged to be salient by our model can be used successfully to create multimedia keyframe summaries of video. These summaries are similar to those created by human raters, and significantly outperform summaries produced by baselines from the literature.
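
The idea of treating gesture salience as a hidden variable can be illustrated with a small sketch. This is not the paper's exact formulation: the potential function, the feature groups (`verbal`, `gesture`, `salience`), and the weight names are assumptions chosen for illustration. The sketch assumes a coreference label y in {-1, +1} and a binary hidden salience variable h in {0, 1}, where gesture features influence the coreference decision only when h = 1, and h is summed out to obtain the probability of coreference.

```python
import math
from itertools import product


def dot(w, x):
    """Plain dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def potential(y, h, feats, weights):
    """Hypothetical log-potential for label y in {-1, +1} and salience h in {0, 1}.

    Verbal features always vote on y; gesture features vote only when h == 1.
    A separate salience term scores how likely the gesture is to be informative.
    """
    verbal = dot(weights["verbal"], feats["verbal"])
    gesture = dot(weights["gesture"], feats["gesture"])
    salience = dot(weights["salience"], feats["salience"])
    return y * (verbal + h * gesture) + h * salience


def joint(feats, weights):
    """Normalized distribution over all (y, h) pairs given the features."""
    scores = {
        (y, h): potential(y, h, feats, weights)
        for y, h in product((-1, 1), (0, 1))
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {k: math.exp(s - top) for k, s in scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


def p_coreferent(feats, weights):
    """P(y = +1 | x), marginalizing over the hidden salience variable."""
    dist = joint(feats, weights)
    return dist[(1, 0)] + dist[(1, 1)]


def p_salient(feats, weights):
    """P(h = 1 | x), marginalizing over the coreference label."""
    dist = joint(feats, weights)
    return dist[(-1, 1)] + dist[(1, 1)]
```

Under these assumptions, training would maximize the log of `p_coreferent` (or its complement for non-coreferent pairs) on labeled noun-phrase pairs, so only coreference annotations are needed. The salience estimate `p_salient` then comes from the learned weights without any salience labels, and could be used to rank gestures, for example when selecting keyframes.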
