Journal of digital imaging: the official journal of the Society for Computer Applications in Radiology

Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification

Abstract

While radiologists regularly issue follow-up recommendations, our preliminary research has shown that 35% to 50% of patients who receive follow-up recommendations for findings of possible cancer on abdominopelvic imaging do not return for follow-up. As such, they remain at risk for adverse outcomes related to missed or delayed cancer diagnosis. In this study, we develop an algorithm that uses natural language processing (NLP) techniques and machine learning models to automatically detect free-text radiology reports containing a follow-up recommendation. The data set used in this study consists of 6000 free-text reports from the authors' institution. NLP techniques are used to engineer 1500 features, which include the most informative unigrams, bigrams, and trigrams in the training corpus after tokenization and Porter stemming. On this data set, we train naive Bayes, decision tree, and maximum entropy models. The decision tree model, with an F1 score of 0.458 and accuracy of 0.862, outperforms both the naive Bayes (F1 score of 0.381) and maximum entropy (F1 score of 0.387) models. We analyzed the models to determine predictive features; the term frequencies of stemmed n-grams such as "renal neoplasm" and "evalu with enhanc" were most predictive of a follow-up recommendation. Key to maximizing performance were feature engineering that extracts predictive information and appropriate selection of machine learning algorithms based on the feature set.
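
The feature engineering and evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the "most informative" n-grams were chosen, so the sketch assumes document frequency as a stand-in ranking. The function names (tokenize, ngrams, build_vocabulary, vectorize, f1_and_accuracy) are hypothetical, and the `stem` argument defaults to the identity function; a Porter stemmer such as NLTK's `PorterStemmer().stem` could be passed in its place.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n_max=3):
    """Yield unigrams, then bigrams, then trigrams as space-joined strings."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def ngram_counts(text, stem=lambda t: t):
    """Term frequency of every n-gram in one report.

    `stem` is a pluggable hook (identity by default); the paper uses
    Porter stemming, which is not reimplemented here.
    """
    return Counter(ngrams([stem(t) for t in tokenize(text)]))


def build_vocabulary(reports, k, stem=lambda t: t):
    """Pick k n-grams to use as features.

    Assumption: ranks by document frequency (ties broken alphabetically).
    The abstract does not state the actual selection criterion.
    """
    doc_freq = Counter()
    for text in reports:
        doc_freq.update(set(ngram_counts(text, stem)))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:k]]


def vectorize(text, vocabulary, stem=lambda t: t):
    """Map one report to a term-frequency vector over the vocabulary."""
    counts = ngram_counts(text, stem)
    return [counts[gram] for gram in vocabulary]


def f1_and_accuracy(y_true, y_pred):
    """F1 for the positive class (1 = follow-up recommended) and accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return f1, correct / len(pairs)
```

The resulting vectors could be fed to any classifier. Computing F1 alongside accuracy matters when positive reports are a minority, because a model can score high accuracy while missing many positives; this is one plausible reading of the gap between the reported accuracy of 0.862 and F1 score of 0.458.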
