International Journal of Computer Vision

Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning

Abstract

The sound of crashing waves, the roar of fast-moving cars: sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds. This paper extends an earlier conference paper, Owens et al. (European Conference on Computer Vision, 2016b), with additional experiments and discussion.
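
The supervision scheme described in the abstract, in which each video frame is paired with a statistical summary of its accompanying sound, can be illustrated with a minimal sketch. The abstract does not say which statistics are used, so this example assumes two simple stand-ins (RMS energy and zero-crossing rate). The names `sound_summary` and `make_training_pairs` and the toy data are hypothetical, and no network is trained here; the sketch only shows how audio could be turned into prediction targets without manual labels.

```python
import math


def sound_summary(samples):
    """Return a small statistical summary of one audio clip.

    Assumption: RMS energy and zero-crossing rate stand in for the
    statistical summary mentioned in the abstract, which is not
    specified there.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("empty audio clip")
    rms = math.sqrt(sum(s * s for s in samples) / n)
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (n - 1) if n > 1 else 0.0
    return [rms, zcr]


def make_training_pairs(frames, clips):
    """Pair each video frame with the summary of its co-occurring audio."""
    if len(frames) != len(clips):
        raise ValueError("each frame needs exactly one audio clip")
    return [(frame, sound_summary(clip)) for frame, clip in zip(frames, clips)]


# Hypothetical toy data: frame identifiers and short synthetic clips.
frames = ["frame_waves", "frame_cars"]
clips = [
    [0.5, -0.5, 0.5, -0.5],  # rapidly alternating signal
    [1.0, 1.0, 1.0, 1.0],    # constant signal
]
pairs = make_training_pairs(frames, clips)
```

In a full pipeline, a convolutional network would take the frame as input and be trained to predict the paired summary, so the audio track supplies the training signal in place of human annotation.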