Journal: Sadhana

An improved predictive association rule based classifier using gain ratio and T-test for health care data diagnosis



Abstract

Health care data diagnosis is a significant task that must be carried out precisely and that requires considerable experience and domain knowledge. Traditional symptom-based disease diagnosis can lead to false presumptions. Recently, Associative Classification (AC), which combines association rule mining with classification, has received attention in health care applications that demand maximum accuracy. Although several AC techniques exist, they fall short in generating the quality rules needed to build an efficient associative classifier. This paper aims to enhance the accuracy of the existing CPAR (Classification based on Predictive Association Rule) algorithm by generating quality rules using Gain Ratio. Health care applications mostly deal with high-dimensional datasets, and high dimensionality leads to unreliable estimates in disease diagnosis. Dimensionality reduction is commonly applied as a preprocessing step before classification to improve classifier accuracy. It eliminates redundant and insignificant dimensions while keeping the useful ones, without information loss. In this work, dimensionality reduction by T-test and by reduct sets (or simply reducts) is performed as a preprocessing step before the CPAR and CPAR using Gain Ratio (CPAR-GR) algorithms. The impact of the T-test and reducts on CPAR and CPAR-GR is also investigated. The paper synthesizes existing work in AC and discusses the factors that influence the performance of CPAR and CPAR-GR. Experiments were conducted using six health care datasets from the UCI machine learning repository. Based on the experiments, CPAR-GR with T-test yields better classification accuracy than CPAR.
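
The abstract names two scoring ideas, Gain Ratio for rule quality and the T-test for dimensionality reduction, without giving formulas. As a rough illustration only, the sketch below uses the textbook definitions: gain ratio as information gain divided by split information, and Welch's two-sample t statistic to rank numeric features for a binary class. The function names (`gain_ratio`, `welch_t`, `rank_features_by_t`) and the toy data are hypothetical, and the paper's actual CPAR-GR procedure may differ.

```python
import math
from collections import Counter
from statistics import mean, variance


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Textbook gain ratio of a categorical attribute: info gain / split info."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Class entropy remaining after partitioning by the attribute's values.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)
    # A constant attribute has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


def welch_t(a, b):
    """Welch's t statistic; assumes each sample has >= 2 values and some variance."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_features_by_t(rows, labels, positive):
    """Feature indices sorted by |t|, largest first (binary classes assumed)."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == positive]
        neg = [r[j] for r, y in zip(rows, labels) if y != positive]
        scores.append((abs(welch_t(pos, neg)), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
rows = [(5, 1.0), (6, 2.0), (7, 3.0), (1, 1.1), (2, 2.1), (3, 2.9)]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features_by_t(rows, labels, positive=1)
```

In a pipeline like the one the abstract describes, a ranking of this kind would be used to keep only the highest-scoring features before rule generation; how many to keep, or which significance threshold to apply, is not stated in the abstract.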
