International Joint Conference on Web Intelligence and Intelligent Agent Technology

Supervised Scene Boundary Detection with Relational and Sequential Information



Abstract

This paper proposes a novel scene boundary detector that accommodates the differing definitions of a scene that arise from different target services or tasks. In the proposed method, the information in shots is divided into two groups: relational and sequential. Relational information is extracted by multi-layered convolutional neural networks that merge and embed similarity vectors computed from visual and audio features. Sequential information, which captures particular patterns across consecutive shots, is handled by dual recurrent neural networks. The different definitions of scenes are reflected in the proposed method through supervised parameter estimation combined with a sampling method. Because scene boundaries are rarely observed in video content, the class distribution is heavily skewed. The sampling method expands the set of boundary instances by generating reverse-order shot sequences, while it reduces the number of non-boundary shots through variance-preserving shot filtering. A focal loss is then adopted for the training process to obtain better parameters from the imbalanced dataset. The proposed method is evaluated on three datasets constructed from real-world movies. The experiments empirically show that different definitions of a scene boundary affect the performance of scene boundary detection, and that the proposed deep neural networks, which exploit both relational and sequential information, can handle diverse scene definitions. Through supervised learning, the proposed method adapts to the definition bias of each dataset. As a result, it demonstrates its effectiveness in handling different types of information and accommodating alternative scene definitions, achieving state-of-the-art performance on two benchmark datasets.
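The imbalance-handling steps in the abstract (oversampling boundary instances and training with a focal loss) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names `focal_loss` and `reverse_order_augment` are hypothetical, and the standard binary focal loss of Lin et al. (2017) is assumed, since the abstract does not give the exact formulation.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).
    Down-weights easy, well-classified shots so that the rare
    boundary shots dominate the gradient.
    p: predicted probability of the boundary class, y: label in {0, 1}."""
    p = np.clip(p, 1e-7, 1 - 1e-7)           # avoid log(0)
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

def reverse_order_augment(shot_feats):
    """Hypothetical sketch of the reverse-order oversampling idea:
    a boundary instance played back-to-front is still a boundary,
    so reversing the shot order yields an extra positive example."""
    return shot_feats[::-1]
```

With `gamma = 0` and `alpha = 1` the focal loss reduces to ordinary cross-entropy; increasing `gamma` shrinks the loss on confident, easy non-boundary shots far more than on misclassified boundary shots.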
