
Automatic Image Annotation Based on WordNet and Hierarchical Ensembles


Abstract

Automatic image annotation is the process of automatically labeling image content with a predefined set of keywords, which are regarded as descriptors of high-level image semantics, so as to enable semantic image retrieval via keywords. A serious problem in this task is unsatisfactory annotation performance caused by the semantic gap between visual content and keywords. To address this problem, we present a new approach that incorporates lexical semantics into the image annotation process. In the training phase, given a training set of images labeled with keywords, a basic visual vocabulary is first generated using K-means clustering combined with semantic constraints obtained from WordNet; the vocabulary consists of visual terms, extracted from the images to represent their content, and the associated keywords. The statistical correlation between visual terms and keywords is then modeled by a two-level hierarchical ensemble model composed of probabilistic SVM classifiers and a co-occurrence language model. In the annotation phase, given an unlabeled image, the most likely associated keywords are predicted by the first-level classifier ensemble from the posterior probability of each keyword given each visual term; the second-level language model then refines the annotation quality using word co-occurrence statistics derived from the annotated keywords in the training set. We carried out experiments on a medium-sized image collection from Corel Stock Photo CDs. The experimental results show that this method outperforms some traditional annotation methods by about 7% in average precision, demonstrating the feasibility and effectiveness of the proposed approach.
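
The second-level refinement step can be illustrated with a minimal sketch. The abstract does not give the model's formulas, so everything here is an assumption made for illustration: the function names, the add-one smoothing, the linear blending weight, and the toy data are hypothetical, and the first-level posteriors are simply supplied as a dictionary rather than produced by real probabilistic SVM classifiers.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(training_annotations):
    """Count keyword occurrences and unordered keyword-pair co-occurrences."""
    word_counts, pair_counts = Counter(), Counter()
    for keywords in training_annotations:
        unique = sorted(set(keywords))
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts


def cond_prob(w, v, word_counts, pair_counts, alpha=1.0):
    """Smoothed estimate of P(w | v); the smoothing scheme is an assumption."""
    pair = tuple(sorted((w, v)))
    vocab_size = len(word_counts)
    return (pair_counts[pair] + alpha) / (word_counts[v] + alpha * vocab_size)


def refine(posteriors, word_counts, pair_counts, weight=0.5, top_k=2):
    """Re-rank first-level keyword posteriors with co-occurrence evidence.

    Each candidate's score blends its own posterior with how strongly the
    other candidates (weighted by their posteriors) tend to co-occur with it.
    """
    scores = {}
    for w, p_w in posteriors.items():
        others = {v: p for v, p in posteriors.items() if v != w}
        total = sum(others.values())
        if total > 0:
            context = sum(
                p * cond_prob(w, v, word_counts, pair_counts)
                for v, p in others.items()
            ) / total
        else:
            context = 0.0
        scores[w] = (1 - weight) * p_w + weight * context
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy training annotations and hypothetical first-level posteriors.
TRAINING_ANNOTATIONS = [
    ["sky", "plane"],
    ["sky", "plane", "cloud"],
    ["sky", "cloud"],
    ["tiger", "grass"],
    ["tiger", "grass", "forest"],
]
FIRST_LEVEL = {"sky": 0.5, "plane": 0.2, "tiger": 0.25, "grass": 0.05}

word_counts, pair_counts = count_cooccurrences(TRAINING_ANNOTATIONS)
print(refine(FIRST_LEVEL, word_counts, pair_counts, weight=0.0))
print(refine(FIRST_LEVEL, word_counts, pair_counts, weight=0.5))
```

With `weight=0.0` the ranking reflects the first-level posteriors alone. Raising the weight lets keywords that frequently co-occur with the strongest candidate in the training annotations move up, which is the kind of correction the second-level language model is described as providing.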
