KEYWORD RECOMMENDATION METHOD AND SYSTEM BASED ON LATENT DIRICHLET ALLOCATION MODEL

Abstract

A keyword recommendation method and system based on a latent Dirichlet allocation (LDA) model. The method comprises: calculating a basic Dirichlet allocation model for the training texts; obtaining an incremental seed word and selecting, from the training texts, the texts that match the incremental seed word to serve as incremental training texts; calculating an incremental Dirichlet allocation model for the incremental training texts; obtaining the probability distribution of complete words over topics and the probability distribution of complete texts over topics; calculating a relevance score between each complete word and every other complete word, so that a relevance score is obtained for every pair of complete words; and determining, from an obtained query word and the pairwise relevance scores, the keywords corresponding to the query word. By employing an incremental training model, the present invention improves the precision of topic clustering and the diversity of topics, and improves the quality of the keywords within the topics.
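
The selection and scoring steps of the abstract can be illustrated with a minimal Python sketch. It does not train any LDA model: the `word_topic` table is a hand-written stand-in for the word-to-topic probabilities that the basic and incremental models would produce. The abstract does not say how the relevance score is computed, so cosine similarity between topic vectors is an assumption made here, and all function names and toy data are hypothetical.

```python
import math


def select_incremental_texts(training_texts, seed_words):
    """Return the training texts that contain at least one seed word."""
    seeds = set(seed_words)
    return [text for text in training_texts if seeds & set(text)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed relevance measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_relevance(word_topic):
    """Score every unordered pair of words from their topic probability vectors."""
    words = sorted(word_topic)
    scores = {}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            scores[(w1, w2)] = cosine(word_topic[w1], word_topic[w2])
    return scores


def recommend(query, scores, top_k=2):
    """Return the top_k words most relevant to the query word."""
    related = []
    for (w1, w2), score in scores.items():
        if w1 == query:
            related.append((score, w2))
        elif w2 == query:
            related.append((score, w1))
    # Highest score first; ties broken alphabetically for determinism.
    related.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in related[:top_k]]


# Hypothetical tokenized training texts.
training_texts = [
    ["phone", "battery", "charger"],
    ["laptop", "battery", "screen"],
    ["flower", "garden", "seed"],
]
incremental_texts = select_incremental_texts(training_texts, ["battery"])

# Hypothetical word-to-topic probabilities standing in for model output.
word_topic = {
    "phone": [0.8, 0.1, 0.1],
    "charger": [0.7, 0.2, 0.1],
    "laptop": [0.6, 0.3, 0.1],
    "garden": [0.05, 0.05, 0.9],
}
scores = pairwise_relevance(word_topic)
keywords = recommend("phone", scores, top_k=2)
print(keywords)
```

In this toy run the seed word "battery" selects the first two texts as incremental training texts, and the query "phone" ranks "charger" and "laptop" ahead of "garden" because their topic vectors point in nearly the same direction.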
