Journal of Electrical Systems and Information Technology

Improve the automatic classification accuracy for Arabic tweets using ensemble methods


Abstract

Tweet classification has become a topic of interest in recent years, especially for the Arabic language. In this paper, Arabic tweets are classified automatically into one of several predetermined categories, mainly sport, culture, politics, technology, and general, based on their linguistic characteristics and content. Classification accuracy is then improved by applying ensemble methods, mainly bagging, boosting, and stacking, to the same dataset used in our earlier classification experiments, in order to verify the results and identify the classifier that gives the highest accuracy. The experimental results show that ensemble methods improve classification accuracy over an individual classifier. Compared with using J48, NB, or SMO as a single classifier, accuracy increased by 1.6% for Naïve Bayes (NB), by 2.2% for Sequential Minimal Optimization (SMO), and by up to 3.2% for the Decision Tree (J48) classifier.
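To make the idea of bagging concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hand-written multinomial Naive Bayes as the base classifier, majority voting over bootstrap resamples, and a tiny invented set of English tokens in place of real Arabic tweets. All names (`train_nb`, `predict_nb`, `train_bagging`, `predict_bagging`, `TOY_TWEETS`) are hypothetical, and boosting and stacking are not shown.

```python
import math
import random
from collections import Counter, defaultdict

# Toy, invented data standing in for tokenized tweets (illustration only).
TOY_TWEETS = [
    (["match", "goal", "team"], "sport"),
    (["team", "league", "goal"], "sport"),
    (["election", "vote", "minister"], "politics"),
    (["minister", "parliament", "vote"], "politics"),
    (["phone", "software", "app"], "technology"),
    (["app", "software", "update"], "technology"),
]


def train_nb(samples):
    """Train a multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, len(samples)


def predict_nb(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocab, n_samples = model
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_samples)
        for tok in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def train_bagging(samples, n_models=15, seed=0):
    """Train n_models base classifiers, each on a bootstrap resample."""
    rng = random.Random(seed)
    return [
        train_nb([rng.choice(samples) for _ in samples])
        for _ in range(n_models)
    ]


def predict_bagging(models, tokens):
    """Combine the base classifiers' predictions by majority vote."""
    votes = Counter(predict_nb(model, tokens) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    single = train_nb(TOY_TWEETS)
    ensemble = train_bagging(TOY_TWEETS)
    query = ["goal", "team"]
    print("single NB:", predict_nb(single, query))
    print("bagged NB:", predict_bagging(ensemble, query))
```

Each base model sees a different resample of the training set, so individual errors caused by sampling noise tend to be outvoted. On a dataset this small the benefit is not meaningful; the sketch only shows the mechanics.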
