Conference: Asian Conference on Computer Vision

Hierarchical Space Tiling for Scene Modeling

Abstract

A typical scene category, such as street or beach, contains an enormous number (on the order of 10^4 to 10^8) of distinct scene configurations, each composed of objects and regions of varying shapes in different layouts. A well-known representation that can effectively address such complexity is the family of compositional models; however, learning the structure of hierarchical compositional models remains a challenging task in vision. The objective of this paper is to present an efficient method for learning such models from a set of scene configurations. We start with an over-complete representation called Hierarchical Space Tiling (HST), which quantizes the huge, continuous scene configuration space in an And-Or tree (AOT). This hierarchical AOT can generate a combinatorial number of configurations (on the order of 10^31) from a small dictionary of elements. We then estimate the HST/AOT model through a learning-by-parsing strategy, which iteratively updates the HST/AOT parameters while constructing the optimal parse tree for each training configuration. Finally, we prune the branches with zero or low probability to obtain a much smaller HST/AOT. The HST quantization allows us to convert the challenging structure-learning problem into a tractable parameter-learning problem. We evaluate the representation in three aspects. (i) Coding efficiency: the learned representation approximates valid configurations with fewer errors and a smaller number of primitives than other popular representations. (ii) Semantic power of learning: the learned representation is less ambiguous when parsing configurations and has semantically meaningful inner concepts; it captures both the diversity and the frequency (prior) of the scene configurations. (iii) Scene classification: the model is not only fully generative but also yields discriminative scene classification performance that outperforms state-of-the-art methods.
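To make the counting and pruning ideas in the abstract concrete, here is a minimal Python sketch of a toy And-Or tree. It is an illustration under stated assumptions, not the paper's implementation: the `Node` class, the binary-split tiling in `build`, the probability values, and the pruning threshold are all hypothetical, and the learning-by-parsing step is not shown. An And-node composes its children (configuration counts multiply), while an Or-node chooses one child (counts add).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # "terminal", "and", or "or"
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # branch probabilities (Or-nodes only)


def build(depth: int, p_split: float = 0.5) -> Node:
    """Hypothetical tiling: a tile is either one region or is split into two sub-tiles."""
    if depth == 0:
        return Node("terminal")
    split = Node("and", [build(depth - 1, p_split), build(depth - 1, p_split)])
    return Node("or", [Node("terminal"), split], [1.0 - p_split, p_split])


def count_configurations(node: Node) -> int:
    """Count the distinct configurations the tree can generate."""
    if node.kind == "terminal":
        return 1
    counts = [count_configurations(c) for c in node.children]
    if node.kind == "and":
        total = 1
        for c in counts:
            total *= c  # And: all children are present, so counts multiply
        return total
    return sum(counts)  # Or: exactly one child is chosen, so counts add


def prune(node: Node, threshold: float) -> Node:
    """Drop Or-branches with probability <= threshold and renormalize the rest."""
    if node.kind == "terminal":
        return node
    if node.kind == "and":
        return Node("and", [prune(c, threshold) for c in node.children])
    pairs = [(c, p) for c, p in zip(node.children, node.probs) if p > threshold]
    if not pairs:  # keep the most probable branch so the node stays valid
        pairs = [max(zip(node.children, node.probs), key=lambda cp: cp[1])]
    total = sum(p for _, p in pairs)
    return Node(
        "or",
        [prune(c, threshold) for c, _ in pairs],
        [p / total for _, p in pairs],
    )


if __name__ == "__main__":
    tree = build(depth=6)
    print("configurations before pruning:", count_configurations(tree))
    rare_split = build(depth=6, p_split=0.05)
    print("after pruning:", count_configurations(prune(rare_split, threshold=0.1)))
```

Even this toy tree grows quickly with depth, since each extra level squares the previous count and adds one. That is the same combinatorial effect the abstract relies on, and pruning low-probability Or-branches shrinks the tree in the same spirit as the final step described above.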
