IEEE International Conference on Control Science and Systems Engineering

An Improved Naive Bayesian Classification Algorithm for Massive Data

Abstract

To address the low speed and accuracy of massive-data classification, an improved Naive Bayesian classification algorithm for massive data processing is proposed. First, rough feature clustering groups the features, which reduces the computational complexity of feature association. Second, an association rule algorithm mines frequent itemsets from the rough-clustering subsets, and the generated frequent itemsets are used to filter the features based on the classification result. Third, the feature set that remains after feature selection is weighted to improve accuracy. Finally, the improved algorithm is implemented on the MapReduce parallel platform and tested on five data sets of different sizes. The experimental results show that the improved algorithm saves considerable running time on large-scale data sets while maintaining high accuracy.
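The abstract does not give the clustering, association-rule, or weighting details, so the following is only a minimal sketch of the last two ideas: a feature-weighted Naive Bayes classifier whose counts are gathered in a map/reduce style. Everything in it is an assumption made for illustration: categorical features, Laplace smoothing (`alpha`), hand-picked `weights` (a weight of 0 stands in for a feature removed by selection), the function names, and the toy data. The map and reduce phases run in-process here rather than on a MapReduce cluster.

```python
from collections import Counter
from functools import reduce
from math import log


def map_counts(record):
    """Map step: emit the counts contributed by one (features, label) record."""
    features, label = record
    counts = Counter({("class", label): 1})
    for idx, value in enumerate(features):
        counts[("feat", idx, value, label)] += 1
    return counts


def reduce_counts(acc, partial):
    """Reduce step: merge a partial count table into the accumulator."""
    acc.update(partial)  # Counter.update adds counts instead of replacing them
    return acc


def train(records):
    """Simulate map and reduce in-process to build the full count table."""
    return reduce(reduce_counts, map(map_counts, records), Counter())


def predict(counts, features, labels, values_per_feature, weights, alpha=1.0):
    """Return the label with the highest weighted log-posterior score."""
    total = sum(counts[("class", c)] for c in labels)
    best_label, best_score = None, float("-inf")
    for c in labels:
        n_c = counts[("class", c)]
        # Smoothed log prior of class c.
        score = log((n_c + alpha) / (total + alpha * len(labels)))
        for idx, value in enumerate(features):
            w = weights.get(idx, 0.0)  # assumption: weight 0 means "filtered out"
            if w == 0.0:
                continue
            # Smoothed conditional probability of this feature value given c.
            p = (counts[("feat", idx, value, c)] + alpha) / (
                n_c + alpha * values_per_feature[idx]
            )
            score += w * log(p)  # the weight scales the feature's log-likelihood
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Hypothetical toy data: (outlook, temperature) -> play
records = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("overcast", "hot"), "yes"),
]
labels = ["no", "yes"]
values_per_feature = {0: 3, 1: 3}  # number of distinct values per feature
weights = {0: 1.0, 1: 0.5}         # illustrative weights, not from the paper

counts = train(records)
prediction = predict(counts, ("sunny", "hot"), labels, values_per_feature, weights)
```

Because the count tables merge by simple addition, the map step can run independently on each data split, which is what makes this kind of classifier a natural fit for a MapReduce-style platform.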
