Conference: International Conference on Advances in Mechanical Engineering and Industrial Informatics

Bagging-Based Logistic Regression With Spark: A Medical Data Mining Method

Abstract

Medical data in various organizational forms is voluminous and heterogeneous, so it is important to use efficient data mining techniques to explore how diverse diseases develop. However, many single-node data analysis tools lack sufficient memory and computing power, so distributed and parallel computing is in great demand. In this paper, we propose a comprehensive medical data mining method consisting of data preprocessing and bagging-based logistic regression with Spark (the BLR algorithm), which is adapted for better compatibility with Spark, a fast parallel computing framework. Experimental results indicate that although the BLR algorithm took slightly longer to run than logistic regression (LR), its accuracy was 2.12% higher than that of LR, and it outperformed LR on other common evaluation metrics.
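The abstract does not give the details of the BLR algorithm, so the following is only a minimal single-machine sketch of the general idea it names: bagging applied to logistic regression. It is not the authors' implementation and does not use Spark. The function names (`train_lr`, `train_bagged_lr`, `predict_bagged`), the toy two-cluster data, the number of base models, and the gradient-descent settings are all assumptions made for illustration. Each base model is trained on its own bootstrap sample, independently of the others, which is the kind of work a parallel framework could spread across workers.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_lr(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent.

    Returns the weights as [bias, w1, w2, ...].
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_lr(w, x):
    # Classify as 1 when the predicted probability is at least 0.5.
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


def train_bagged_lr(X, y, n_models=11, seed=0):
    """Train n_models logistic regressions, each on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_lr([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, x):
    # Majority vote over the base models.
    votes = sum(predict_lr(w, x) for w in models)
    return 1 if 2 * votes > len(models) else 0


def make_toy_data(n=200, seed=1):
    # Hypothetical data: two Gaussian clusters, one per class.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randrange(2)
        center = 1.5 if label else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
models = train_bagged_lr(X, y)
accuracy = sum(predict_bagged(models, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.3f}")
```

An odd number of base models avoids ties in the vote. Averaging the predicted probabilities instead of voting is a common alternative; which aggregation the paper uses is not stated in the abstract.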
