Recently, substantial research effort has focused on how to apply CNNs or RNNs to better extract temporal patterns from videos, so as to improve the accuracy of video classification. In this paper, however, we show that temporal information, especially longer-term patterns, may not be necessary to achieve competitive results on common video classification datasets. We investigate the potential of purely attention-based local feature integration. Accounting for the characteristics of such features in video classification, we propose a local feature integration framework based on attention clusters, and introduce a shifting operation to capture more diverse signals. We carefully analyze and compare the effect of different attention mechanisms, cluster sizes, and the use of the shifting operation, and also investigate the combination of attention clusters for multimodal integration. We demonstrate the effectiveness of our framework on three real-world video classification datasets. Our model achieves competitive results across all of these. In particular, on the large-scale Kinetics dataset, our framework obtains an excellent single-model accuracy of 79.4% top-1 and 94.0% top-5 on the validation set. Attention clusters are the backbone of our winning solution at the ActivityNet Kinetics Challenge 2017. Code and models will be released soon.
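As a rough illustration of the ideas named above, the following is a minimal NumPy sketch of one possible attention cluster with a shifting operation. The linear scoring vector, the scalar shift parameters `alpha` and `beta`, and the `1/sqrt(N)` cluster scaling are illustrative assumptions based on a common reading of such designs, not the paper's exact formulation.

```python
import numpy as np

def attention_unit(X, w, alpha, beta):
    """One attention unit over a set of local features.

    X: (L, D) array of L local features of dimension D.
    w: (D,) illustrative linear scoring vector (an assumption).
    alpha, beta: learnable scalars of the shifting operation (assumed form).
    """
    scores = X @ w                          # (L,) attention logits
    a = np.exp(scores - scores.max())
    a /= a.sum()                            # softmax over local features
    v = a @ X                               # (D,) attention-weighted sum
    s = alpha * v + beta                    # shifting: scale and bias
    return s / (np.linalg.norm(s) + 1e-8)   # L2-normalize to encourage diversity

def attention_cluster(X, params):
    """Concatenate N attention-unit outputs, scaled by 1/sqrt(N)."""
    N = len(params)
    outs = [attention_unit(X, w, a, b) for (w, a, b) in params]
    return np.concatenate(outs) / np.sqrt(N)

# Toy usage with random features and parameters.
rng = np.random.default_rng(0)
L, D, N = 10, 16, 4
X = rng.standard_normal((L, D))
params = [(rng.standard_normal(D), 1.0, 0.1) for _ in range(N)]
g = attention_cluster(X, params)            # (N * D,) global representation
```

Because each unit output is unit-normalized and the concatenation is scaled by `1/sqrt(N)`, the cluster output has unit L2 norm regardless of the number of units, which keeps representations comparable across cluster sizes.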