Journal: Expert Systems with Applications

Modeling spatial layout for scene image understanding via a novel multiscale sum-product network

Abstract

Semantic image segmentation is challenging because of the large intra-class variations and the complex spatial layouts found in natural scenes. This paper addresses the problem by designing a new deep architecture, called the multiscale sum-product network (MSPN), which takes multiscale unary potentials as inputs and models the spatial layout of image content hierarchically. That is, the proposed MSPN models the joint distribution of multiscale unary potentials and object classes, rather than the single-scale unary potentials used in common settings. In addition, MSPN characterizes scene spatial layouts in a fine-to-coarse manner to enforce labeling consistency. Unary potentials at different scales thus help overcome the semantic ambiguities caused by evaluating only single local regions, while long-range spatial correlations further refine the image labeling. Higher-order priors, in turn, impose constraints among labels. In this way, multiscale unary potentials, long-range spatial correlations, and higher-order priors are all modeled within a single unified framework in MSPN. We conduct experiments on two challenging benchmarks, the MSRC-21 dataset and the SIFT Flow dataset. The results demonstrate the superior performance of our method compared with previous graphical models for understanding scene images. (C) 2016 Elsevier Ltd. All rights reserved.
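
To make the idea of combining unary potentials from several scales more concrete, the following is a minimal toy sketch of a sum-product network with one sum node over classes and one product node per class. It is not the MSPN architecture from the paper: the function names, the two-class setup, the uniform weights, and the potential values are all hypothetical and chosen only for illustration.

```python
def spn_value(weights, unaries_per_scale):
    """Evaluate a toy network: one root sum node over K product nodes.

    weights: K sum-node weights (assumed to sum to 1).
    unaries_per_scale: list of S lists, each holding K unary potentials
        for the same region at one scale (hypothetical inputs).
    Returns the root value and the K product-node values.
    """
    products = []
    for k in range(len(weights)):
        value = 1.0
        # Product node: multiply the class-k potentials across all scales.
        for scale in unaries_per_scale:
            value *= scale[k]
        products.append(value)
    # Sum node: weighted sum of the product-node values.
    total = sum(w * p for w, p in zip(weights, products))
    return total, products


def most_likely_label(weights, unaries_per_scale):
    """Return the class index with the largest weighted product."""
    _, products = spn_value(weights, unaries_per_scale)
    scores = [w * p for w, p in zip(weights, products)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-class example for a single image region.
weights = [0.5, 0.5]
fine = [0.6, 0.4]    # the fine scale alone slightly favors class 0
coarse = [0.2, 0.8]  # the coarse scale strongly favors class 1

label_fine_only = most_likely_label(weights, [fine])
label_multiscale = most_likely_label(weights, [fine, coarse])
total, _ = spn_value(weights, [fine, coarse])
```

In this toy setting the fine scale on its own selects class 0, while adding the coarse scale changes the decision to class 1. This mirrors, in a very reduced form, the abstract's point that evidence from multiple scales can resolve ambiguities left by a single local region.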