Conference: International Conference on Advanced Electronic Materials, Computers and Software Engineering

Application Research of Text Classification Based on Random Forest Algorithm


Abstract

Traditional random forest algorithms classify text poorly when the extracted text features are of low quality, so a random forest method designed for text information is proposed. Because the quality of the individual decision trees in a traditional random forest is difficult to control, a weighted voting mechanism is proposed to improve the quality of the decision trees. The algorithm uses the tr-k method, based on text feature extraction, to improve the quality and diversity of text features, and uses the latest BERT word vector generation model to represent the text. Experimental data in a Python environment show that this method achieves better text classification results than both an IDF-based random forest algorithm and the original random forest algorithm.
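
The abstract does not say how the weighted voting mechanism assigns its weights, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each tree carries a single non-negative weight (hypothetically, its accuracy on held-out data) and that the forest predicts the label with the largest total weight. The function name `weighted_vote` and the sample labels and weights are illustrative.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the label whose supporting trees have the largest total weight.

    tree_predictions: one predicted label per tree, for a single document.
    tree_weights: one non-negative weight per tree. Assumption: something
    like held-out accuracy; the abstract does not specify the scheme.
    """
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("need exactly one weight per tree")
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    # Ties are resolved in favor of the label that was seen first.
    return max(totals, key=totals.get)


# Illustrative values only: three trees vote on one document.
predictions = ["sports", "politics", "politics"]
weights = [0.9, 0.4, 0.3]

majority = weighted_vote(predictions, [1.0] * 3)  # equal weights: plain majority vote
weighted = weighted_vote(predictions, weights)    # one strong tree outweighs two weak ones
print(majority, weighted)
```

With equal weights the function reduces to the majority vote of an ordinary random forest. With unequal weights a single high-weight tree can override several low-weight ones, which is how such a scheme can limit the influence of poor-quality trees.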
