IEEE International Conference on Image Processing

FPHA-Afford: A Domain-Specific Benchmark Dataset for Occluded Object Affordance Estimation in Human-Object-Robot Interaction

Abstract

In human-object-robot interactions, the recent explosion of standard datasets has offered promising opportunities for deep learning techniques in understanding the functionalities of object parts. However, most existing datasets are only suitable for applications where objects are non-occluded or isolated during interaction, whereas occlusion is a common challenge in practical object affordance estimation tasks. In this paper, we attempt to address this issue by introducing a new benchmark dataset named FPHA-Afford, built upon the popular FPHA dataset. In FPHA-Afford, we adopt the egocentric view to pre-process the videos from FPHA and select the frames that contain objects under strong hand occlusion. To transfer the domain of FPHA to the object affordance estimation task, all frames are re-annotated with pixel-level affordance masks. In total, FPHA-Afford comprises 61 videos containing 4.3K frames with 6.55K annotated affordance masks belonging to 9 classes. Several state-of-the-art semantic segmentation architectures are explored and evaluated on FPHA-Afford. We believe the scale, diversity and novelty of FPHA-Afford could offer great opportunities to researchers in the computer vision community and beyond. Our dataset and experiment code will be made publicly available at https://github.com/Hussainflr/FPHA-Afford
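The abstract describes evaluating semantic segmentation architectures on pixel-level affordance masks but does not state the metric or data format. Below is a minimal sketch of one common evaluation choice for this kind of benchmark, per-class intersection-over-union (IoU) over the 9 affordance classes; the helper names (per_class_iou, mean_iou) and the integer label-map format are assumptions for illustration, not details from the paper.

```python
import numpy as np

NUM_CLASSES = 9  # affordance classes reported for FPHA-Afford

def per_class_iou(pred, gt, num_classes=NUM_CLASSES):
    """IoU for each affordance class between two integer label maps.

    pred, gt: arrays of the same shape with values in [0, num_classes).
    Classes absent from both prediction and ground truth are NaN.
    """
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union > 0:
            ious[c] = np.logical_and(p, g).sum() / union
    return ious

def mean_iou(pairs, num_classes=NUM_CLASSES):
    """Per-class mean IoU over (prediction, ground-truth) mask pairs,
    ignoring classes absent from a given image."""
    per_image = np.array([per_class_iou(p, g, num_classes) for p, g in pairs])
    return np.nanmean(per_image, axis=0)
```

Averaging the per-class means (rather than pooling all pixels) keeps rare affordance classes from being dominated by frequent ones, which matters when only 9 classes cover 6.55K masks.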
