Attentive Models in Vision: Computing Saliency Maps in the Deep Learning Era

Abstract

Estimating the focus of attention of a person looking at an image or a video is a crucial step that can enhance many vision-based inference mechanisms; image segmentation and annotation, video captioning, and autonomous driving are some examples. The early stages of attentive behavior are typically bottom-up; reproducing the same mechanism means finding the saliency embodied in the images, i.e. which parts of an image pop out of a visual scene. This process has been studied for decades, both in neuroscience and in terms of computational models for reproducing the human cortical process. In the last few years, early models have been replaced by deep learning architectures that outperform any early approach on public datasets. In this paper, we discuss why convolutional neural networks (CNNs) are so accurate in saliency prediction. We present our deep learning architectures, which combine bottom-up cues with higher-level semantics and incorporate the concept of time in the attentional process through LSTM recurrent architectures. Finally, we present a video-specific architecture based on the C3D network, which can extract spatio-temporal features by means of 3D convolutions to model task-driven attentive behaviors. The merit of this work is to show that these deep networks are not mere brute-force methods tuned on massive amounts of data, but well-defined architectures that closely recall the early saliency models, although improved with the semantics learned from human ground truth.

Bibliographic details

Source
International Conference of the Italian Association for Artificial Intelligence, 2017, pp. 387-399 (13 pages)
Authors
Marcella Cornia; Davide Abati; Lorenzo Baraldi; Andrea Palazzi; Simone Calderara; Rita Cucchiara
Original format: PDF
Keywords
Saliency; Human attention; Neuroscience; Vision; Deep learning

