...
ACM Transactions on Multimedia Computing, Communications, and Applications

A Benchmark Dataset and Comparison Study for Multi-modal Human Action Analytics

Abstract

Large-scale benchmarks provide a solid foundation for the development of action analytics. Most previous activity benchmarks focus on analyzing actions in RGB videos; large-scale, high-quality benchmarks for multi-modal action analytics are lacking. In this article, we introduce the PKU Multi-Modal Dataset (PKU-MMD), a new large-scale benchmark for multi-modal human action analytics. It consists of about 28,000 action instances and 6.2 million frames in total and provides high-quality multi-modal data sources, including RGB, depth, infrared radiation (IR), and skeletons. To make PKU-MMD more practical, the dataset comprises two subsets captured under different settings for action understanding, namely Part I and Part II. Part I contains 1,076 untrimmed video sequences covering 51 action classes performed by 66 subjects, while Part II contains 1,009 untrimmed video sequences covering 41 action classes performed by 13 subjects. Compared to Part I, Part II is more challenging due to short action intervals, concurrent actions, and heavy occlusion. PKU-MMD can be leveraged in two scenarios: action recognition with trimmed video clips and action detection with untrimmed video sequences. For each scenario, we provide benchmark performance on both subsets by evaluating different methods with different modalities under two evaluation protocols. Experimental results show that PKU-MMD poses a significant challenge to many state-of-the-art methods. We further show that features learned on PKU-MMD transfer well to other datasets. We believe this large-scale dataset will boost research in action analytics for the community.
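To make the detection scenario concrete, the sketch below scores a predicted action interval against a ground-truth annotation using temporal intersection-over-union (IoU), the overlap criterion commonly used for action detection on untrimmed sequences. This is a minimal illustration only: the function names (temporal_iou, is_correct_detection) and the 0.5 overlap threshold are assumptions for exposition, not the paper's exact evaluation protocols.

    # Minimal sketch of interval-level detection scoring on untrimmed
    # sequences. Intervals are (start_frame, end_frame) pairs; the 0.5
    # threshold is an illustrative assumption, not the paper's protocol.

    def temporal_iou(pred, gt):
        """Intersection-over-union of two (start_frame, end_frame) intervals."""
        inter = max(0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
        union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
        return inter / union if union > 0 else 0.0

    def is_correct_detection(pred, gt, pred_label, gt_label, threshold=0.5):
        """A detection counts as correct when the predicted class matches the
        annotation and the temporal overlap exceeds the IoU threshold."""
        return pred_label == gt_label and temporal_iou(pred, gt) >= threshold

    # Example: a predicted interval vs. an annotated one for the same class.
    print(temporal_iou((120, 300), (150, 320)))  # 0.75
    print(is_correct_detection((120, 300), (150, 320), "drink", "drink"))  # True

Counting matches of this kind per class at a range of IoU thresholds is what typically underlies mean average precision for temporal detection; the short intervals and concurrent actions in Part II make high-IoU matches correspondingly harder to achieve.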
