JMIR Medical Informatics

Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study

Abstract

Background: Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews.

Objective: We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports.

Methods: From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta Stroke Program Early CT Score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and overall accuracy of the NLP approach relative to the manually extracted data.

Results: The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status.

Conclusions: NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.
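The five measures reported in the abstract all derive from a 2x2 comparison of NLP output against the manually extracted reference labels. The following is a minimal sketch of that calculation, not the authors' code: the function name `diagnostic_metrics`, the `DiagnosticMetrics` container, and the ten toy labels are hypothetical and chosen only for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def diagnostic_metrics(nlp_labels, manual_labels):
    """Compare binary NLP labels with manual (reference) labels.

    Both arguments are equal-length sequences of truthy/falsy values,
    one entry per report. Hypothetical helper, for illustration only.
    """
    if len(nlp_labels) != len(manual_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for predicted, reference in zip(nlp_labels, manual_labels):
        if predicted and reference:
            tp += 1  # NLP flagged the attribute and so did the manual review
        elif predicted and not reference:
            fp += 1  # NLP flagged it, manual review did not
        elif not predicted and reference:
            fn += 1  # NLP missed an attribute found on manual review
        else:
            tn += 1  # both agree the attribute is absent

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return DiagnosticMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
        accuracy=ratio(tp + tn, tp + fp + tn + fn),
    )


# Toy labels (NOT study data): 1 = attribute present, 0 = absent.
manual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
nlp = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = diagnostic_metrics(nlp, manual)
print(metrics)
```

Because PPV and NPV depend on how common the attribute is in the sample, they can differ between the training and validation sets even when sensitivity and specificity are similar.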