...

Machine learning model for diagnostic method prediction in parasitic disease using clinical information


Abstract

Diagnosing a parasitic disease is difficult in clinical practice. In this study, we constructed a machine learning model for diagnosis prediction using patient information. First, we diagnosed whether a patient has a parasitic disease. Next, if the patient has a parasitic disease, we predicted the appropriate diagnosis method among six types of diagnostic terms (biopsy, endoscopy, microscopy, molecular, radiology, and serology). To build the datasets, we extracted patient information from PubMed abstracts published from 1956 to 2019. We then used two datasets: one for predicting parasite-infected patients (N = 8748) and one for predicting the diagnosis method (N = 3780). We compared four machine learning models: support vector machine, random forest, multi-layered perceptron, and gradient boosting. To address the data imbalance problem, the synthetic minority over-sampling technique and TomekLinks were used. In the parasite-infected patient dataset, random forest, random forest with synthetic minority over-sampling technique, gradient boosting, gradient boosting with synthetic minority over-sampling technique, and gradient boosting with TomekLinks demonstrated the best performance (AUC: 79%). In the diagnosis method dataset, gradient boosting with synthetic minority over-sampling technique was the best model (AUC: 87%). For the per-class prediction, gradient boosting demonstrated the best performance in biopsy (AUC: 88%). In endoscopy (AUC: 94%), molecular (AUC: 90%), and radiology (AUC: 88%), gradient boosting with synthetic minority over-sampling technique demonstrated the best performance. Random forest demonstrated the best performance in microscopy (AUC: 82%) and serology (AUC: 85%). We calculated feature importance using gradient boosting; age had the highest feature importance.
In conclusion, this study demonstrated that gradient boosting with synthetic minority over-sampling technique can predict a parasitic disease and serve as a promising diagnosis tool for binary classification and multi-classification schemes.
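
The abstract names two ideas that are easier to follow with a concrete sketch: over-sampling the minority class by interpolation (the core step of the synthetic minority over-sampling technique) and scoring a classifier by AUC. The following is a minimal illustration in plain Python, not the authors' implementation. The function names `smote` and `roc_auc`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for this example; the abstract does not state which software or settings were used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): four minority points at the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points))  # 6

# Toy scores (made up): three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Each synthetic point lies on a line segment between two existing minority points, so over-sampling adds variety without leaving the region the minority class already occupies. For real work, library implementations such as `SMOTE` and `TomekLinks` in imbalanced-learn and `roc_auc_score` in scikit-learn are the usual choice.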

Bibliographic record

  • Source
    Expert Systems with Applications | 2021, Issue 12 | 115658.1-115658.11 | 11 pages
  • Author affiliations

    Seoul Natl Univ Dept Trop Med & Parasitol Coll Med Seoul 03080 South Korea|Inst Endem Dis Seoul 03080 South Korea;

    Yonsei Univ Dept Pharmacol Coll Med Seoul 03722 South Korea|Yonsei Univ Severance Biomed Sci Inst Coll Med Seoul 03722 South Korea;

    Seoul Natl Univ Dept Trop Med & Parasitol Coll Med Seoul 03080 South Korea|Inst Endem Dis Seoul 03080 South Korea|Seoul Natl Univ Bundang Hosp Seongnam 13620 South Korea;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Machine learning; Parasite; Diagnosis; Multi-classification; Binary-classification;

