Journal: 《计算机与现代化》 (Computer and Modernization)

Education Text Classification Method Based on a Hierarchical Class Topic Graph Model

Abstract

In the era of Internet big data, online educational resources are growing explosively. Hierarchical classification can meet the need to classify large volumes of education texts into multiple classes at multiple levels. However, the class representation models used in traditional hierarchical classification suffer from high-dimensional, sparse vectors and lack semantic understanding. To address these problems, an education text classification method based on a hierarchical class topic graph model is proposed. The text set is modeled with the hierarchical class topic graph model to obtain hierarchical class-word probability matrices for the texts. To further strengthen the correlation between feature words and topic classes, combined features are extracted by exploiting the complementarity of three feature extraction methods. Finally, the texts are classified with a hierarchical SVM classifier. Experimental results show that, compared with traditional hierarchical text classification methods, the macro-averaged metrics MacroP, MacroR, and MacroF1 all improve to some extent, indicating that the method classifies online education texts well and has promising applications.
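The abstract reports MacroP, MacroR, and MacroF1 without defining them. As a rough illustration, the sketch below computes macro-averaged precision, recall, and F1 as the unweighted mean of the per-class scores. This is one common convention and an assumption here: the paper may instead define MacroF1 as the harmonic mean of MacroP and MacroR. The function name `macro_scores` and the sample labels are hypothetical and do not come from the paper.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (MacroP, MacroR, MacroF1) as unweighted means of per-class scores.

    Assumption: MacroF1 is the mean of per-class F1 values, which may differ
    from the definition used in the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one label is required")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the true class was different
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p_den = tp[c] + fp[c]
        r_den = tp[c] + fn[c]
        # Classes with an empty denominator score 0.0 by convention.
        p = tp[c] / p_den if p_den else 0.0
        r = tp[c] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for four education texts.
y_true = ["math", "math", "physics", "physics"]
y_pred = ["math", "physics", "physics", "physics"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Because every class carries equal weight regardless of its size, macro averaging penalizes poor performance on rare classes, which matters when a class hierarchy has many small leaf categories.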
