Journal: Artificial Intelligence in Medicine

Application of irregular and unbalanced data to predict diabetic nephropathy using visualization and feature selection methods



Abstract

Objective: Diabetic nephropathy is kidney damage caused by diabetes mellitus. It is a common complication and a leading cause of death in people with diabetes. However, the decline in kidney function varies considerably between patients, and the determinants of diabetic nephropathy have not been clearly identified. It is therefore very difficult to predict the onset of diabetic nephropathy accurately with simple statistical approaches such as the t-test or the χ²-test. To predict the onset accurately, we applied various machine learning techniques, such as support vector machine (SVM) classification and feature selection methods, to an irregular and unbalanced diabetes dataset. Another important objective was to visualize the risk factors, giving physicians intuitive information on each patient's clinical pattern.
Methods and materials: We collected medical data from 292 patients with diabetes and performed preprocessing to extract 184 features from the irregular data. To predict the onset of diabetic nephropathy, we compared several classification methods: logistic regression, SVM, and SVM with a cost-sensitive learning method. We also applied several feature selection methods to remove redundant features and improve classification performance. For risk factor analysis with SVM classifiers, we developed a new visualization system that uses a nomogram approach.
Results: Linear SVM classifiers combined with wrapper or embedded feature selection methods showed the best results. Among the 184 features, the classifiers selected the same 39 features and achieved an area under the receiver operating characteristic curve (AUC) of 0.969. The visualization tool was able to present the effect of each feature on the decision via graphical output.
Conclusions: Our proposed method can predict the onset of diabetic nephropathy about 2-3 months before the actual diagnosis, with high prediction performance, from an irregular and unbalanced dataset, which statistical methods such as the t-test and logistic regression could not achieve. Additionally, the visualization system provides physicians with intuitive information for risk factor analysis. Physicians can therefore benefit from automatic early warnings for each patient and from visualized risk factors, which facilitates planning of effective and appropriate treatment strategies.
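
The abstract names three ideas that a small example can make concrete: cost-sensitive weighting for an unbalanced dataset, the area under the ROC curve, and a nomogram-style breakdown of a linear classifier's decision into per-feature effects. The following is a minimal sketch in plain Python under stated assumptions, not the authors' implementation. The labels, scores, feature names (`feature_a`, `feature_b`), weights, and bias are made-up illustration values, and the inverse-frequency weighting is a common heuristic that the abstract does not confirm was used.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count).

    A common heuristic for cost-sensitive learning (assumed here; the
    abstract does not state which cost scheme the authors used).
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def feature_contributions(weights, bias, x):
    """Split a linear decision value w.x + b into per-feature terms.

    This additive split is what a nomogram displays for a linear model.
    """
    terms = {name: w * x[name] for name, w in weights.items()}
    return terms, sum(terms.values()) + bias


# Hypothetical toy data: 6 negative and 2 positive patients.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [-1.2, -0.7, -0.9, 0.1, -0.3, -1.5, 0.4, -0.1]

class_weights = balanced_class_weights(labels)
auc = roc_auc(labels, scores)

# Hypothetical linear model and one patient's standardized feature values.
weights = {"feature_a": 0.8, "feature_b": -0.5}
bias = -1.0
patient = {"feature_a": 2.0, "feature_b": 1.0}
terms, decision = feature_contributions(weights, bias, patient)

print(class_weights)
print(round(auc, 3))
print(terms, round(decision, 3))
```

With a linear SVM the decision value is a sum of one term per feature plus a bias, so each term can be shown as its own bar or axis. That additivity is why the nomogram approach pairs naturally with the linear classifiers reported as performing best.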
