Conference: IEEE International Conference on Robotics and Automation

Geometric Pretraining for Monocular Depth Estimation



Abstract

ImageNet-pretrained networks have been widely used in transfer learning for monocular depth estimation. These pretrained networks are trained with classification losses, which exploit only semantic information and ignore spatial information. However, both semantic and spatial information are important for per-pixel depth estimation. In this paper, we design a novel self-supervised geometric pretraining task that is tailored for monocular depth estimation using uncalibrated videos. The designed task decouples the structure information from input videos by a simple yet effective conditional autoencoder-decoder structure. Using almost unlimited videos from the internet, networks are pretrained to capture a variety of structures of the scene and can be easily transferred to depth estimation tasks using calibrated images. Extensive experiments demonstrate that the proposed geometric-pretrained networks perform better than ImageNet-pretrained networks in terms of accuracy, few-shot learning and generalization ability. Using existing learning methods, geometric-transferred networks achieve new state-of-the-art results by a large margin. The pretrained networks will be open-sourced soon.
