EURASIP Journal on Applied Signal Processing
Interaction between high-level and low-level image analysis for semantic video object extraction


Abstract

The task of extracting a semantic video object is split into two subproblems: object segmentation and region segmentation. Object segmentation relies on a priori assumptions, whereas region segmentation is data-driven and can be solved automatically. These two subproblems are not mutually independent, and each can benefit from interaction with the other. In this paper, a framework for such interaction is formulated. This representation scheme, based on region segmentation and semantic segmentation, is compatible with the view that image analysis and scene understanding problems can be decomposed into low-level and high-level tasks. Low-level tasks pertain to region-oriented processing, whereas high-level tasks are closely related to object-level processing. This approach emulates the human visual system: what one "sees" in a scene depends on the scene itself (region segmentation) as well as on the cognitive task at hand (semantic segmentation). The higher-level segmentation yields a partition corresponding to semantic video objects. Semantic video objects do not usually have invariant physical properties, and their definition depends on the application. Hence, the definition incorporates complex domain-specific knowledge and is not easy to generalize. For the specific implementation used in this paper, motion is used as a clue to semantic information. In this framework, an automatic algorithm is presented for computing the semantic partition based on color change detection. The change detection strategy is designed to be immune to sensor noise and local illumination variations. The lower-level segmentation identifies the partition corresponding to perceptually uniform regions. These regions are derived by clustering in an N-dimensional feature space composed of static as well as dynamic image attributes.
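The abstract does not give the paper's exact detector, but the idea of color change detection that is insensitive to sensor noise and local illumination variations can be sketched as follows: compare chromaticity (which discounts uniform brightness changes) between frames and threshold the difference at a multiple of an assumed noise level. The function name, the chromaticity choice, and the threshold rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def change_mask(frame_a, frame_b, noise_std=2.0, k=3.0):
    """Per-pixel change detection between two RGB frames.

    Differences are computed on chromaticity (R, G) / (R + G + B),
    which discounts uniform illumination changes, and thresholded at
    k times an assumed sensor-noise level. Illustrative sketch only.
    """
    def chroma(f):
        f = f.astype(np.float64)
        s = f.sum(axis=2, keepdims=True) + 1e-6  # avoid division by zero
        return f[..., :2] / s

    d = np.abs(chroma(frame_a) - chroma(frame_b)).sum(axis=2)
    # rough heuristic: scale noise_std (gray levels) into chromaticity units
    thresh = k * noise_std / 255.0
    return d > thresh
```

A uniform brightening of the whole frame leaves chromaticity unchanged, so the mask stays empty; only genuine color changes exceed the noise-scaled threshold.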
We propose an interaction mechanism between the semantic and the region partitions which makes it possible to cope with multiple simultaneous objects. Experimental results show that the proposed method extracts semantic video objects with high spatial accuracy and temporal coherence.
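One simple way to realize such an interaction between the two partitions is to project the coarse semantic mask onto the accurate low-level regions: a region joins the object if most of its pixels fall inside the mask, so the object boundary follows the region contours rather than the noisy mask. A minimal sketch, assuming a majority-overlap rule (the names and the rule are assumptions, not the paper's exact mechanism):

```python
import numpy as np

def regions_to_objects(region_labels, semantic_mask, min_overlap=0.5):
    """Assign each low-level region to the semantic object or background.

    region_labels: integer label image from the data-driven partition.
    semantic_mask: boolean change-detection mask (the semantic partition).
    A region is kept if more than min_overlap of its pixels are masked.
    """
    object_mask = np.zeros_like(semantic_mask, dtype=bool)
    for label in np.unique(region_labels):
        region = region_labels == label
        overlap = semantic_mask[region].mean()  # fraction of masked pixels
        if overlap > min_overlap:
            object_mask |= region
    return object_mask
```

Because the decision is per region, a few spurious mask pixels inside a background region are outvoted, while a region mostly covered by the mask is recovered whole.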
