Support patient search on pathology reports with interactive online learning based data extraction

Shuai Zheng; James J. Lu; Christina Appin; Daniel Brat; Fusheng Wang

Abstract

Background: Structured reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entry, information in pathology reports remains primarily in narrative free-text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often built from a fixed training dataset and are not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable, semi-automated data extraction system that adjusts and improves itself by learning from a user's interactions, with minimal human effort.

Methods: We have developed an online machine learning based information extraction system called IDEAL-X. When a report is loaded in the graphical user interface, the system's data extraction engine automatically annotates values for the user to review. The system analyzes the user's corrections to these annotations with online machine learning, and incrementally refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be used to define queries over the extracted structured data.

Results: We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic markers, and procedures. The system achieves F1 scores of around 95% for the majority of tests.

Conclusions: Extracting data from pathology reports could provide more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge between online machine learning based data extraction and the knowledge gained from human feedback. By combining iterative online learning with adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
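The review-and-correct loop described in the Methods paragraph can be illustrated with a minimal Python sketch. This is not IDEAL-X's implementation, which the abstract does not detail: the class `OnlineExtractor`, the function `f1_score`, the plain vocabulary lookup used in place of a real learning model, and the two sample report texts are all hypothetical and invented here for illustration. The sketch only shows the general pattern: annotate a report, score the proposal against the reviewer's values, then learn from those values before the next report.

```python
from collections import defaultdict


class OnlineExtractor:
    """Toy correction-driven extractor (hypothetical, not IDEAL-X itself).

    It keeps one controlled vocabulary per data element and grows it
    from the values a reviewer confirms or corrects.
    """

    def __init__(self):
        # Maps a data element (e.g. "diagnosis") to its known terms.
        self.vocabulary = defaultdict(set)

    def annotate(self, report_text):
        """Propose a value per data element by vocabulary lookup."""
        text = report_text.lower()
        proposals = {}
        for element, terms in self.vocabulary.items():
            matches = [term for term in terms if term in text]
            if matches:
                # Prefer the longest term; break ties alphabetically.
                proposals[element] = max(matches, key=lambda t: (len(t), t))
        return proposals

    def learn(self, reviewed_values):
        """Add the reviewer's final values to the vocabularies."""
        for element, value in reviewed_values.items():
            self.vocabulary[element].add(value.lower())


def f1_score(proposed, reviewed):
    """F1 of proposed element/value pairs against the reviewed ones."""
    true_pos = sum(1 for k, v in proposed.items() if reviewed.get(k) == v)
    precision = true_pos / len(proposed) if proposed else 0.0
    recall = true_pos / len(reviewed) if reviewed else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented sample data: (report text, values after human review).
reports = [
    ("Diagnosis: glioblastoma. Procedure: biopsy.",
     {"diagnosis": "glioblastoma", "procedure": "biopsy"}),
    ("Biopsy shows glioblastoma.",
     {"diagnosis": "glioblastoma", "procedure": "biopsy"}),
]

extractor = OnlineExtractor()
scores = []
for text, reviewed in reports:
    proposed = extractor.annotate(text)          # automatic annotation
    scores.append(f1_score(proposed, reviewed))  # measured before learning
    extractor.learn(reviewed)                    # incremental update

print(scores)
```

In this toy run the first report yields no proposals because the vocabulary starts empty, and the second is annotated fully from what the reviewer supplied for the first, which mirrors the abstract's point that annotation effort falls as reports are processed.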

Bibliographic details

Source: Journal of Pathology Informatics, 2015, Issue 1
Authors: Shuai Zheng; James J. Lu; Christina Appin; Daniel Brat; Fusheng Wang
Original format: PDF
Chinese Library Classification: Pathology

