Journal: 《计算机技术与发展》 (Computer Technology and Development)

Prediction of Literature Topic Evolution Based on the Latent Dirichlet Allocation Model

         

Abstract

Using Latent Dirichlet Allocation (LDA) to analyze how the topics of scientific papers evolve, based on how those topics changed in previous years, is a current focus of topic-evolution research. Because the topic evolution of scientific papers has the no-aftereffect (memoryless) property, a Markov chain is used to predict it. The method first uses LDA to obtain the topics in each time window, then links topics in neighboring time windows using measures such as similarity. According to their intensity, topics are divided into three states: popular, normal, and cold. Finally, the state transition probability matrix obtained from the Markov chain is used to analyze and forecast the trend of topic intensity. Experiments on the NIPS proceedings show that after a long period of evolution the state proportions of scientific-paper topics stabilize, with popular, normal, and cold topics at about 30%, 60%, and 10% respectively. This indicates that the method can effectively predict topic evolution over the next few years from existing evolution results.
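The Markov-chain step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the state encoding, the function names `estimate_transition_matrix` and `stationary_distribution`, and the toy sequences are all assumptions made for illustration, and the toy data are not the NIPS results. The sketch counts transitions between the three intensity states across consecutive time windows, normalizes each row into probabilities, and then iterates the chain until the state proportions stop changing.

```python
import numpy as np

# Hypothetical encoding of the three intensity states named in the abstract.
STATES = ("popular", "normal", "cold")


def estimate_transition_matrix(sequences, n_states=3):
    """Estimate a row-stochastic transition matrix from state sequences.

    Each sequence lists one topic's state index per time window.
    """
    counts = np.zeros((n_states, n_states))
    for seq in sequences:
        for current, nxt in zip(seq[:-1], seq[1:]):
            counts[current, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    # Rows with no observed transitions fall back to a uniform distribution.
    return np.where(row_sums > 0, counts / safe_sums, 1.0 / n_states)


def stationary_distribution(P, n_iter=1000, tol=1e-12):
    """Approximate the long-run state proportions by repeated multiplication."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    return pi


# Toy data (made up): four topics tracked over six time windows.
toy_sequences = [
    [0, 0, 1, 1, 1, 2],
    [1, 1, 0, 1, 1, 1],
    [2, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 2, 1],
]

P = estimate_transition_matrix(toy_sequences)
pi = stationary_distribution(P)
for name, share in zip(STATES, pi):
    print(f"{name}: {share:.3f}")
```

The stabilized proportions reported in the abstract correspond to this kind of long-run distribution; the numbers printed here depend entirely on the toy sequences.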
