Conference on Artificial Intelligence in Medicine

Identification of Patient Prescribing Predicting Cancer Diagnosis Using Boosted Decision Trees


Abstract

Machine learning has the potential to identify patterns in pre-diagnostic prescribing that act as an early signal of cancer diagnosis. Danish studies using classical regression models have shown that prescribing of particular drugs increases in the months prior to lung and colorectal cancer diagnosis. The aim of this case-control study is to assess whether machine learning can extend these findings by identifying combinations of prescriptions that might act as pre-cancer signals. We use a boosted trees approach to analyse prescriptions data from the NHS Business Services Authority, linked to English cancer registry data, to classify individuals into two classes: cancer patients and controls. We then identify the drugs that contributed the most to the classification decisions in the models. To the best of our knowledge, this is the first study utilising machine learning to find pre-diagnostic, primary-care-prescription-related indicators of cancer diagnosis in England. We assess two feature selection approaches: text categorisation methods alone, and text categorisation methods in combination with clinical domain knowledge. Age-matched control samples (ten controls for each patient) are used throughout. We train models for matched cohorts of 6,770 lung cancer patients and 5,869 colorectal cancer patients starting the cancer pathway for the first time between January and March 2016. The models outperform classical methods on AUC, AUC-PR, and $F_{0.5}$ score, showing strong potential for using machine learning to extract signals from this dataset to aid earlier diagnosis. Our findings confirm the Danish studies.
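
The abstract reports an F-beta score with beta = 0.5 but does not define it. As a minimal sketch, assuming the standard definition F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with precision P and recall R, the Python below shows how beta = 0.5 rewards precision more than recall. The function name `f_beta` and all counts are hypothetical illustrations, not code or figures from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta score from confusion-matrix counts; beta < 1 weights precision more."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the study.
# Classifier A: precision 0.8, recall 0.4.
high_precision = f_beta(tp=40, fp=10, fn=60)
# Classifier B: precision 0.4, recall 0.8.
high_recall = f_beta(tp=80, fp=120, fn=20)

print(f"F0.5, high precision: {high_precision:.3f}")
print(f"F0.5, high recall:    {high_recall:.3f}")
```

With beta = 1 the two hypothetical classifiers would score the same, because their precision and recall are simply swapped. With beta = 0.5 the higher-precision classifier scores higher, so false positives are penalised more heavily than missed cases.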