Conference on Internet Multimedia Management Systems Ⅲ, Jul 31-Aug 1, 2002, Boston, USA

Towards Semantic-based Retrieval of Visual Information: A Model-based approach

Abstract

This paper addresses the problem of automated visual content classification. To enable classification-based retrieval of images or visual objects, we propose a new image representation scheme called the visual context descriptor (VCD): a multidimensional vector in which each element represents the frequency of a unique visual property of an image or a region. VCD uses predetermined quality dimensions (i.e., types of features and quantization levels) and semantic model templates mined a priori. Both observed visual cues and contextually relevant visual features are proportionally incorporated in VCD. The contextual relevance of a visual cue to a semantic class is determined by correlation analysis of ground-truth samples. Such co-occurrence analysis of visual cues requires transforming a real-valued visual feature vector (e.g., color histogram, Gabor texture) into a discrete event (e.g., a term in text). Good features to track, the rule of thirds, iterative k-means clustering, and TSVQ (tree-structured vector quantization) are used to transform feature vectors into unified symbolic representations called visual terms. Because sparse sample data makes frequency estimates of visual cues unstable, similarity-based visual cue frequency estimation is also proposed and used to ensure the correctness of model learning and matching. The proposed method naturally allows heterogeneous visual, temporal, or spatial cues to be integrated in a single classification or matching framework, and it can be easily integrated into a semantic knowledge base such as a thesaurus or an ontology. Robust creation of semantic visual model templates and object-based image retrieval are demonstrated using the proposed content description scheme.
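The idea of turning real-valued region features into discrete visual terms and then counting them can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses plain Lloyd's k-means in place of the iterative k-means and TSVQ pipeline named in the abstract, and the Gaussian kernel used for the similarity-based frequency estimate is an assumption, since the abstract does not say how similarity is measured. All function names and parameters (`kmeans`, `hard_term_frequencies`, `soft_term_frequencies`, `bandwidth`) are hypothetical.

```python
import numpy as np

def _distances(features, codebook):
    # Euclidean distance from every feature vector to every codebook entry: shape (n, k).
    return np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) codebook whose rows act as visual terms."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=k, replace=False)
    codebook = features[start].astype(float)
    for _ in range(iters):
        labels = _distances(features, codebook).argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                codebook[j] = members.mean(axis=0)
    return codebook

def hard_term_frequencies(features, codebook):
    """Relative frequency of each visual term, counting only the nearest term."""
    labels = _distances(features, codebook).argmin(axis=1)
    counts = np.bincount(labels, minlength=len(codebook))
    return counts / counts.sum()

def soft_term_frequencies(features, codebook, bandwidth=1.0):
    """Similarity-weighted frequencies (assumed Gaussian kernel): each feature
    spreads one vote over all terms, so sparse samples give smoother estimates."""
    logits = -_distances(features, codebook) ** 2 / (2.0 * bandwidth ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    freqs = weights.sum(axis=0)
    return freqs / freqs.sum()

# Synthetic stand-in for per-region feature vectors (e.g. color or texture features).
rng = np.random.default_rng(1)
region_features = np.vstack([rng.normal(0.0, 0.1, (30, 4)),
                             rng.normal(1.0, 0.1, (10, 4))])
codebook = kmeans(region_features, k=2)
print(hard_term_frequencies(region_features, codebook))
print(soft_term_frequencies(region_features, codebook, bandwidth=0.5))
```

The resulting frequency vector plays the role of a descriptor in which each element is the frequency of one visual term. The soft variant shows one way similarity weighting could stabilize counts when samples are sparse; how the paper weights contextually relevant features is not specified in the abstract and is not modeled here.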
