Conference: Web technologies and applications

Milling the Discriminative Word Sets for Bag-of-Words Model Based on Distributional Similarity Graph


Abstract

Most previous distributional clustering methods are fundamentally unsupervised, and the discriminative property of words is not well modeled in the clustering procedure. In this paper, we propose a supervised model that uses the class-conditional probability to measure word similarity and recasts word-set extraction as a supervised graph-partition optimization problem. A greedy algorithm is proposed to solve this model; it combines word selection and word grouping in a unified framework. By grouping related words, the method essentially transforms the exact match between word bins into a fuzzy match between groups of related-word bins, which to some extent avoids the synonymy problem in the bag-of-words (BoW) model. Experiments on data sets demonstrate that the proposed method is applicable to both text sets and image sets, and that it yields better retrieval precision while reducing the lexicon size.
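The abstract does not give the similarity measure or the objective of the graph partition, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that word similarity is measured by the Jensen-Shannon divergence between class-conditional distributions P(class | word), that "discriminative" means low entropy of that distribution, and that grouping is done in a single greedy pass. The function names and the thresholds `max_entropy` and `max_div` are hypothetical.

```python
import math
from collections import defaultdict


def class_distributions(docs, labels):
    """Estimate P(class | word) from labeled bag-of-words documents."""
    classes = sorted(set(labels))
    counts = defaultdict(lambda: [0.0] * len(classes))
    for words, label in zip(docs, labels):
        k = classes.index(label)
        for w in words:
            counts[w][k] += 1.0
    return {w: [c / sum(row) for c in row] for w, row in counts.items()}


def entropy(p):
    """Shannon entropy in bits; low entropy = word concentrated in few classes."""
    return -sum(a * math.log2(a) for a in p if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (assumed measure, not from the paper)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def greedy_word_sets(dists, max_entropy=0.9, max_div=0.1):
    """Select discriminative words, then greedily group similar ones.

    Selection: keep words whose class distribution has entropy <= max_entropy.
    Grouping: visit words from most to least discriminative and attach each
    one to the first group whose seed distribution is within max_div,
    otherwise start a new group with this word as its seed.
    """
    selected = sorted(
        (w for w in dists if entropy(dists[w]) <= max_entropy),
        key=lambda w: (entropy(dists[w]), w),
    )
    groups = []  # list of (seed distribution, member words)
    for w in selected:
        for seed, members in groups:
            if js_divergence(dists[w], seed) <= max_div:
                members.append(w)
                break
        else:
            groups.append((dists[w], [w]))
    return [members for _, members in groups]


# Toy labeled corpus (made up for illustration).
docs = [
    ["car", "auto", "the"],
    ["auto", "car", "engine"],
    ["cat", "kitten", "the"],
    ["kitten", "cat", "pet"],
]
labels = ["vehicles", "vehicles", "animals", "animals"]

groups = greedy_word_sets(class_distributions(docs, labels))
print(groups)  # [['auto', 'car', 'engine'], ['cat', 'kitten', 'pet']]
```

In this toy run the uninformative word "the" is dropped by the selection step, and "car" and "auto" fall into the same group, so a query containing one would match a document containing the other once histograms are built over groups rather than single words.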

Bibliographic record

  • Source
  • Conference location: Guangzhou (CN)
  • Authors

    Wen Wen; Zhifeng Hao; Ruichu Cai;

  • Author affiliations

    School of Computer, Guangdong University of Technology, Guangdong, China;

    School of Computer, Guangdong University of Technology, Guangdong, China;

    School of Computer, Guangdong University of Technology, Guangdong, China;

  • Conference organizer
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords
