IEEE Transactions on Image Processing

Quantitative Analysis of Human-Model Agreement in Visual Saliency Modeling: A Comparative Study



Abstract

Visual attention is a process that enables biological and machine vision systems to select the most relevant regions from a scene. Relevance is determined by two components: 1) top-down factors driven by task and 2) bottom-up factors that highlight image regions that are different from their surroundings. The latter are often referred to as "visual saliency." Modeling bottom-up visual saliency has been the subject of numerous research efforts during the past 20 years, with many successful applications in computer vision and robotics. Available models have been tested with different datasets (e.g., synthetic psychophysical search arrays, natural images, or videos) using different evaluation scores (e.g., search slopes, comparison to human eye tracking) and parameter settings. This has made direct comparison of models difficult. Here, we perform an exhaustive comparison of 35 state-of-the-art saliency models over 54 challenging synthetic patterns, three natural image datasets, and two video datasets, using three evaluation scores. We find that although model rankings vary, some models consistently perform better. Analysis of the datasets reveals that they are highly center-biased, which influences some of the evaluation scores. Computational complexity analysis shows that some models are very fast yet yield competitive eye movement prediction accuracy. Different models often share the same easy and difficult stimuli. Furthermore, we discuss several concerns in visual saliency modeling, eye movement datasets, and evaluation scores, and provide insights for future work. Our study allows one to assess the state of the art, helps to organize this rapidly growing field, and sets a unified comparison framework for gauging future efforts, similar to the PASCAL VOC challenge in the object recognition and detection domains.
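To make the fixation-based evaluation concrete, the sketch below illustrates one score widely used in this literature, Normalized Scanpath Saliency (NSS): z-score the saliency map, then average it at human fixation locations, so 0 is chance and higher means better human-model agreement. The abstract does not name the three scores used in the study, so NSS here is an illustrative assumption rather than the paper's exact protocol, and the toy map and fixation lists are hypothetical.

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency: mean of the z-scored saliency
    map at human fixation locations (0 = chance, higher = better)."""
    z = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-12)
    rows, cols = zip(*fixations)  # fixations given as (row, col) pixel pairs
    return float(z[list(rows), list(cols)].mean())

# Toy example: a centered Gaussian "saliency map" on a 100x100 image.
y, x = np.mgrid[0:100, 0:100]
smap = np.exp(-((x - 50) ** 2 + (y - 50) ** 2) / (2 * 10.0 ** 2))

print(nss(smap, [(50, 50), (48, 53), (54, 47)]))  # near-center fixations: high NSS
print(nss(smap, [(5, 90), (92, 8), (10, 10)]))    # corner fixations: below chance
```

The toy map also hints at the center-bias issue the abstract raises: because human fixations in most datasets cluster near the image center, a fixed central Gaussian that ignores image content already scores well on measures like this one, which is one reason shuffled or bias-corrected variants of such scores are used.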
