Siamese attentional keypoint network for high performance visual tracking


Abstract

Visual tracking is one of the most fundamental topics in computer vision. Numerous tracking approaches based on discriminative correlation filters or Siamese convolutional networks have attained remarkable performance over the past decade. However, developing robust and effective trackers that achieve satisfactory performance with high computational and memory efficiency in real-world scenarios is still widely recognized as an open research problem. In this paper, we investigate the impact of three main aspects of visual tracking, i.e., the backbone network, the attentional mechanism, and the detection component, and propose a Siamese Attentional Keypoint Network, dubbed SATIN, for efficient tracking and accurate localization. First, a new lightweight Siamese hourglass network is designed specifically for visual tracking. It exploits repeated bottom-up and top-down inference to capture more global and local contextual information at multiple scales. Second, a novel cross-attentional module leverages both channel-wise and spatial intermediate attentional information, which enhances both the discriminative and localization capabilities of the feature maps. Third, a keypoint detection approach is devised to track any target object by detecting the top-left corner point, the centroid point, and the bottom-right corner point of its bounding box. As a result, our SATIN tracker not only has a strong capability to learn more effective object representations, but is also computationally and memory efficient during both training and testing. To the best of our knowledge, we are the first to propose this approach. Without bells and whistles, experimental results demonstrate that our approach achieves state-of-the-art performance on several recent benchmark datasets, at a speed far exceeding 27 frames per second. (c) 2019 Elsevier B.V. All rights reserved.
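
The abstract states that the target is located by detecting the top-left corner, the centroid, and the bottom-right corner of its bounding box, but gives no decoding details. The following is only an illustrative sketch of how three keypoint score maps could be turned into a box. It assumes one score map per keypoint and a simple peak-picking rule; the names `argmax_2d`, `decode_box`, `one_hot_map`, and the `tolerance` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def argmax_2d(heatmap: Heatmap) -> Tuple[int, int]:
    """Return (row, col) of the largest value in a 2-D score map."""
    best_row, best_col = 0, 0
    best_score = heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    return best_row, best_col


def decode_box(top_left: Heatmap, center: Heatmap, bottom_right: Heatmap,
               tolerance: float = 1.0) -> Tuple[int, int, int, int, bool]:
    """Decode (x1, y1, x2, y2, consistent) from three keypoint score maps.

    Assumption (not from the paper): each keypoint is the peak of its map,
    and the detected centroid is used only as a sanity check against the
    midpoint of the two detected corners.
    """
    y1, x1 = argmax_2d(top_left)
    y2, x2 = argmax_2d(bottom_right)
    cy, cx = argmax_2d(center)
    mid_x, mid_y = (x1 + x2) / 2, (y1 + y2) / 2
    consistent = abs(mid_x - cx) <= tolerance and abs(mid_y - cy) <= tolerance
    return x1, y1, x2, y2, consistent


def one_hot_map(size: int, row: int, col: int) -> Heatmap:
    """Build a toy size x size score map with a single peak (demo helper)."""
    grid = [[0.0] * size for _ in range(size)]
    grid[row][col] = 1.0
    return grid


if __name__ == "__main__":
    tl = one_hot_map(8, 1, 2)   # top-left corner at row 1, col 2
    ct = one_hot_map(8, 3, 4)   # centroid at row 3, col 4
    br = one_hot_map(8, 5, 6)   # bottom-right corner at row 5, col 6
    print(decode_box(tl, ct, br))
```

In this toy setup the corners and the centroid agree, so the consistency flag is true. A real tracker would presumably operate on learned score maps and handle sub-pixel offsets, which this sketch deliberately omits.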

Bibliographic details

  • Source
    Knowledge-Based Systems | 2020, Issue 6 | 105448.1-105448.16 | 16 pages
  • Authors

  • Author affiliations

    Harbin Inst Technol Sch Elect & Informat Engn Shenzhen Peoples R China|Harbin Inst Technol Sch Astronaut Harbin Peoples R China;

    Harbin Inst Technol Sch Elect & Informat Engn Shenzhen Peoples R China;

    Univ Granada Andalusian Res Inst Data Sci & Computat Intellige Granada Spain|Iwate Prefectural Univ Fac Software & Informat Sci Takizawa Iwate Japan;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Visual tracking; Siamese hourglass networks; Cross-attentional module; Keypoint detection;

