Journal of Pathology Informatics

Support patient search on pathology reports with interactive online learning based data extraction


Abstract

Background: Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entry, information in pathology reports remains primarily in narrative free-text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often built with a fixed training dataset and are not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort.

Methods: We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections to these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on the extracted structured data.

Results: We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests.

Conclusions: Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from human feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
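
The review-and-correct loop described in the Methods can be illustrated with a toy sketch. The abstract does not specify IDEAL-X's learning model or data structures, so everything below is an assumption made for illustration: `ToyExtractor`, its `annotate` and `learn` methods, and the `f1_score` helper are hypothetical names, the "model" is just a per-element vocabulary that grows from user-reviewed values, and the two report snippets are invented. It only shows the general shape of the idea: propose annotations, score them against the user's reviewed values, then update incrementally.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToyExtractor:
    """Hypothetical vocabulary-driven extractor; not IDEAL-X's actual engine."""

    # Maps a data element name (e.g. "procedure") to the terms known for it.
    vocab: dict[str, set[str]] = field(default_factory=dict)

    def annotate(self, text: str) -> dict[str, str]:
        """Propose at most one value per data element by matching known terms."""
        lowered = text.lower()
        proposals = {}
        for element, terms in self.vocab.items():
            # Try longer terms first so a specific phrase beats a shorter one.
            for term in sorted(terms, key=lambda t: (-len(t), t)):
                if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                    proposals[element] = term
                    break
        return proposals

    def learn(self, reviewed: dict[str, str]) -> None:
        """Online update: fold the user's reviewed values into the vocabulary."""
        for element, value in reviewed.items():
            self.vocab.setdefault(element, set()).add(value.lower())


def f1_score(predicted: dict[str, str], gold: dict[str, str]) -> float:
    """F1 over (element, value) pairs for a single report."""
    true_pos = len(set(predicted.items()) & set(gold.items()))
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented report snippets paired with the values a user would confirm.
reviewed_values = {
    "procedure": "excision",
    "diagnosis": "invasive ductal carcinoma",
}
reports = [
    ("Procedure: excision. Diagnosis: invasive ductal carcinoma.", reviewed_values),
    ("Excision specimen shows invasive ductal carcinoma.", reviewed_values),
]

extractor = ToyExtractor()
scores = []
for text, reviewed in reports:
    proposed = extractor.annotate(text)            # system proposes annotations
    scores.append(f1_score(proposed, reviewed))    # compare with the reviewed values
    extractor.learn(reviewed)                      # incremental update from feedback
```

In this sketch the extractor starts with an empty vocabulary, so it proposes nothing for the first report and the user supplies every value. After that single update it proposes both values for the second report, which mirrors the abstract's claim that annotation effort falls as reports are processed.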
