Venue: IEEE International Conference on Data Mining

Diverse Topic Phrase Extraction through Latent Semantic Analysis

Abstract

We propose a novel algorithm for extracting diverse topic phrases in order to summarize large corpora. Previous work often ignores the importance of diversity and thus extracts phrases crowded around a few hot topics while failing to cover other, less obvious but still important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our new algorithm can improve both relevance and diversity across topics in topic phrase extraction.
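The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of diversifying phrases in an LSA space, not the authors' method. It assumes a small phrase-by-document count matrix, uses a truncated SVD as the LSA step, and swaps in a generic greedy relevance-versus-redundancy selection (in the style of maximal marginal relevance). The names `lsa_embed`, `select_diverse`, and `trade_off`, the toy data, and the frequency-based relevance score are all hypothetical; document re-weighting is not modeled here.

```python
import numpy as np

def lsa_embed(phrase_doc, k):
    """Embed phrases (rows) in a k-dimensional latent space via truncated SVD."""
    U, s, _ = np.linalg.svd(phrase_doc, full_matrices=False)
    return U[:, :k] * s[:k]

def select_diverse(vectors, relevance, n, trade_off=0.5):
    """Greedy pick: reward relevance, penalise similarity to earlier picks.

    Assumption: this is a generic MMR-style rule, not the paper's scoring.
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T  # cosine similarity between phrases in latent space
    chosen = []
    candidates = list(range(len(relevance)))
    while candidates and len(chosen) < n:
        def score(i):
            redundancy = max((sim[i, j] for j in chosen), default=0.0)
            return trade_off * relevance[i] - (1 - trade_off) * redundancy
        best = max(candidates, key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Toy data (hypothetical): phrases 0 and 1 share a "hot" topic in documents 0-2,
# phrases 2 and 3 belong to a less prominent topic in document 3.
counts = np.array([
    [3, 2, 3, 0],
    [2, 3, 2, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 1],
], dtype=float)
relevance = counts.sum(axis=1) / counts.sum(axis=1).max()  # crude frequency score

by_relevance_only = select_diverse(lsa_embed(counts, 2), relevance, 2, trade_off=1.0)
with_diversity = select_diverse(lsa_embed(counts, 2), relevance, 2, trade_off=0.5)
print(by_relevance_only, with_diversity)
```

With `trade_off=1.0` the selection ignores redundancy and takes both phrases from the hot topic; with `trade_off=0.5` the second pick moves to the less prominent topic, which is the crowding effect the abstract describes.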