Journal: Engineering Applications of Artificial Intelligence

Deep feature weighting for naive Bayes and its application to text classification


Abstract

Naive Bayes (NB) continues to be one of the top 10 data mining algorithms due to its simplicity, efficiency, and efficacy. Of the numerous proposals to improve the accuracy of naive Bayes by weakening its feature independence assumption, the feature weighting approach has received less attention from researchers. Moreover, to our knowledge, all existing feature weighting approaches incorporate the learned feature weights only into the classification formula of naive Bayes and not into its conditional probability estimates. In this paper, we propose a simple, efficient, and effective feature weighting approach, called deep feature weighting (DFW), which estimates the conditional probabilities of naive Bayes by deeply computing feature-weighted frequencies from training data. Empirical studies on a collection of 36 benchmark datasets from the UCI repository show that naive Bayes with deep feature weighting rarely degrades the quality of the model compared to standard naive Bayes and, in many cases, improves it dramatically. In addition, we apply the proposed deep feature weighting to some state-of-the-art naive Bayes text classifiers and achieve remarkable improvements.
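The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of the idea: the same per-feature weights scale the training frequencies used for the conditional probability estimates and also act as exponents in the classification formula. The function names `train_weighted_nb` and `predict_weighted_nb`, the smoothing terms, and the example weights are illustrative assumptions, not the authors' published method; how the weights are learned is outside this sketch.

```python
import math


def train_weighted_nb(X, y, weights):
    """Fit a categorical naive Bayes model from feature-weighted frequencies.

    X: list of tuples of categorical values, y: list of class labels,
    weights: one non-negative weight per feature (assumed given; learning
    them is not shown here).
    """
    n = len(X)
    classes = sorted(set(y))
    values = [sorted({row[i] for row in X}) for i in range(len(weights))]

    # Laplace-smoothed class priors (smoothing choice is an assumption).
    prior = {
        c: (sum(1 for label in y if label == c) + 1.0) / (n + len(classes))
        for c in classes
    }

    # Conditional estimates built from weighted counts: every matching
    # training instance contributes weights[i] instead of 1.
    cond = {}
    for i, w in enumerate(weights):
        n_values = len(values[i])
        for c in classes:
            class_mass = w * sum(1 for label in y if label == c)
            for v in values[i]:
                joint_mass = w * sum(
                    1 for row, label in zip(X, y) if label == c and row[i] == v
                )
                # Smoothing terms 1/n_values and 1 are assumptions chosen so
                # that the estimates sum to 1 over the values of feature i.
                cond[(i, v, c)] = (joint_mass + 1.0 / n_values) / (class_mass + 1.0)
    return classes, prior, cond


def predict_weighted_nb(model, weights, x):
    """Classify x, using the weights again as exponents (log-space sum).

    Assumes every value in x was seen during training.
    """
    classes, prior, cond = model

    def score(c):
        total = math.log(prior[c])
        for i, w in enumerate(weights):
            total += w * math.log(cond[(i, x[i], c)])
        return total

    return max(classes, key=score)


# Toy usage with made-up data and made-up weights.
X_toy = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y_toy = ["no", "no", "yes", "yes"]
w_toy = [2.0, 1.0]
model_toy = train_weighted_nb(X_toy, y_toy, w_toy)
label_toy = predict_weighted_nb(model_toy, w_toy, ("sunny", "hot"))
```

Setting every weight to 1 makes the weighted counts ordinary frequency counts and the exponents neutral, which gives a convenient baseline for comparison.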

Bibliographic record

  • Source
  • Author affiliations

    Department of Computer Science, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan 430074, China

    Department of Mathematics, China University of Geosciences, Wuhan 430074, China

    Department of Computer Science, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan 430074, China

    Department of Computer Science, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan 430074, China

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Naive Bayes; Feature weighting; Correlation-based feature selection; Text classification;

