Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Object Detection from Scratch with Deep Supervision

Abstract

In this paper, we propose Deeply Supervised Object Detectors (DSOD), an object detection framework that can be trained from scratch. Recent advances in object detection depend heavily on off-the-shelf models pre-trained on large-scale classification datasets such as ImageNet and OpenImage. However, transferring pre-trained models from classification to detection may introduce learning bias, because the two tasks differ in objective function and in the distribution of object categories. Fine-tuning on the detection task alleviates this issue to some extent but does not address its root cause. Furthermore, transferring these pre-trained models across discrepant domains (e.g., from RGB to depth images) is even more difficult. A better way to handle these problems is therefore to train object detectors from scratch, which motivates our proposed method. Previous efforts in this direction mainly failed because of limited training data and naive backbone network structures for object detection. In DSOD, we contribute a set of design principles for learning object detectors from scratch. A key principle is deep supervision, enabled by layer-wise dense connections in both the backbone network and the prediction layers, which plays a critical role in learning good detectors from scratch. Incorporating several other principles, we build DSOD on the single-shot detection framework (SSD). We evaluate our method on the PASCAL VOC 2007, PASCAL VOC 2012, and COCO datasets. DSOD achieves consistently better results than state-of-the-art methods with much more compact models. Specifically, DSOD outperforms the baseline method SSD on all three benchmarks while requiring only half the parameters. We also observe that DSOD achieves comparable or slightly better results than Mask RCNN [1] + FPN [2] (at a similar input size) with only one third of the parameters, using no extra data or pre-trained models.
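The abstract does not spell out what "layer-wise dense connections" look like, so the following is only a minimal sketch of the general idea, assuming DenseNet-style concatenation: every layer in a block receives the block input together with the outputs of all earlier layers. The names `dense_block`, `make_toy_layer`, and `dense_block_width` are hypothetical, and the toy layer is a stand-in for a real convolution, not the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]


def dense_block(x: Vector, layers: List[Callable[[Vector], Vector]]) -> Vector:
    """Run a toy densely connected block.

    Each layer receives the concatenation of the block input and the outputs
    of all earlier layers; its own output is appended to that running
    concatenation (DenseNet-style connectivity, assumed for illustration).
    """
    features = list(x)
    for layer in layers:
        new_features = layer(features)      # the layer sees everything produced so far
        features = features + new_features  # concatenate instead of overwriting
    return features


def make_toy_layer(growth: int) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a conv layer: emits `growth` new features."""
    def layer(features: Vector) -> Vector:
        mean = sum(features) / len(features)
        return [mean + i for i in range(growth)]
    return layer


def dense_block_width(in_features: int, num_layers: int, growth: int) -> int:
    """Features leaving the block: the input plus `growth` per layer."""
    return in_features + num_layers * growth


if __name__ == "__main__":
    out = dense_block([1.0, 3.0], [make_toy_layer(2), make_toy_layer(2)])
    print(out)                         # input features come first, unchanged
    print(dense_block_width(2, 2, 2))  # matches len(out)
```

Because early features are carried forward unchanged to the block output, a loss applied to that output reaches them through a short path. That is the usual intuition for treating dense connectivity as an implicit form of deep supervision, though the abstract itself does not elaborate on the mechanism.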
