Multimedia Tools and Applications

Egocentric visual scene description based on human-object interaction and deep spatial relations among objects



Abstract

Visual scene interpretation has been a major area of research in recent years. Recognition of human-object interaction is a fundamental step towards understanding visual scenes. Videos can be described via a variety of human-object interaction scenarios: both human and object static (static-static), one static while the other is dynamic (static-dynamic), and both dynamic (dynamic-dynamic). This paper presents a unified framework for explaining these interactions between humans and a variety of objects, using deep learning as the pivot methodology. Human-object interaction is extracted through native machine learning techniques, while spatial relations are captured by training a convolutional neural network. We also address the recognition of human posture in detail to provide an egocentric visual description. After extracting visual features, sequential minimal optimization is employed to train the model. The extracted interaction, spatial relations, and posture information are fed into a natural language generation module, along with the label of the interacting object, to generate a scene description. The proposed framework is evaluated on two state-of-the-art datasets, MSCOCO and the MSR3D Daily Activity dataset, achieving accuracies of 78% and 91.16%, respectively.
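To illustrate the classification and description stages summarized in the abstract, the sketch below is a minimal, hypothetical example rather than the authors' implementation: feature dimensions, the label sets, and the template-based sentence generator are all assumptions made for illustration. It uses scikit-learn's SVC, whose underlying libsvm solver is an SMO-style optimizer, to stand in for the sequential-minimal-optimization training step, and combines the predicted interaction label with assumed spatial-relation, posture, and object labels.

```python
# Hypothetical sketch of the interaction-classification and description stages.
# Not the paper's code: features, labels, and the template generator are placeholders.
import numpy as np
from sklearn.svm import SVC  # libsvm's solver is an SMO-style algorithm

# Placeholder training data: rows are visual feature vectors (e.g. pooled CNN features),
# labels are the three interaction scenarios described in the paper.
X_train = np.random.rand(200, 128)            # assumed 128-D feature vectors
y_train = np.random.randint(0, 3, size=200)   # 0: static-static, 1: static-dynamic, 2: dynamic-dynamic

interaction_clf = SVC(kernel="rbf")           # trained via an SMO-type optimizer
interaction_clf.fit(X_train, y_train)

INTERACTION_NAMES = ["static-static", "static-dynamic", "dynamic-dynamic"]

def describe_scene(features, spatial_relation, posture, object_label):
    """Template-based description from predicted and given labels (illustrative only)."""
    interaction = INTERACTION_NAMES[int(interaction_clf.predict([features])[0])]
    return (f"The person is {posture}, interacting with the {object_label} "
            f"({interaction} interaction); the {object_label} is {spatial_relation} the person.")

# Example usage with a dummy feature vector and assumed labels.
print(describe_scene(np.random.rand(128), "in front of", "sitting", "laptop"))
```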
