Journal of the American Medical Informatics Association

Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives

Abstract

Objective: Identifying clinical events (eg, problems, tests, treatments) and their associated temporal expressions (eg, dates and times) is a key task in extracting and managing data from electronic health records. As part of the i2b2 2012 Natural Language Processing for Clinical Data challenge, we developed and evaluated a system that automatically extracts temporal expressions and events from clinical narratives. The extracted temporal expressions were additionally normalized by assigning a type, value, and modifier.

Materials and methods: The system combines rule-based and machine learning approaches that rely on morphological, lexical, syntactic, semantic, and domain-specific features. Rule-based components were designed to handle the recognition and normalization of temporal expressions, while conditional random fields models were trained for event and temporal recognition.

Results: The system achieved micro F scores of 90% for the extraction of temporal expressions and 87% for clinical event extraction. The normalization component for temporal expressions achieved accuracies of 84.73% (expression type), 70.44% (value), and 82.75% (modifier).

Discussion: The system's performance on both event and temporal expression mining was comparable to the initial agreement between human annotators (87-89%). While lenient identification of such mentions is achievable, finding the exact boundaries proved challenging.

Conclusions: The system provides a state-of-the-art method for the automated identification of mentions of clinical events and temporal expressions in narratives, either to support the manual review process or as part of large-scale processing of electronic health databases.
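To make the rule-based recognition and normalization step more concrete, here is a minimal sketch in Python. It is not the authors' rule set: the two regular expressions, the `Timex` record, the `extract_timexes` function, the US-style month/day order, and the ISO 8601-style values are all assumptions chosen for illustration. Only the type, value, and modifier triple comes from the abstract.

```python
import re
from dataclasses import dataclass


@dataclass
class Timex:
    """A recognized temporal expression with its normalized attributes."""
    text: str
    start: int
    end: int
    type: str      # e.g. DATE or DURATION (label set assumed for illustration)
    value: str     # ISO 8601-style value (assumed normalization target)
    modifier: str  # e.g. NA or APPROX (label set assumed for illustration)


# Hypothetical rules: a numeric date (month/day/year order assumed) and a
# simple duration with an optional approximation cue in front of it.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
DURATION_RE = re.compile(
    r"\b(?:(about|approximately)\s+)?(\d+)\s+(day|week|month|year)s?\b",
    re.IGNORECASE,
)
UNIT_CODES = {"day": "D", "week": "W", "month": "M", "year": "Y"}


def extract_timexes(text):
    """Apply each rule to the text and return matches in document order."""
    found = []
    for m in DATE_RE.finditer(text):
        month, day, year = (int(g) for g in m.groups())
        # No calendar validation here; a real rule set would reject 13/45/2012.
        value = f"{year:04d}-{month:02d}-{day:02d}"
        found.append(Timex(m.group(0), m.start(), m.end(), "DATE", value, "NA"))
    for m in DURATION_RE.finditer(text):
        approx, amount, unit = m.groups()
        value = f"P{int(amount)}{UNIT_CODES[unit.lower()]}"
        modifier = "APPROX" if approx else "NA"
        found.append(
            Timex(m.group(0), m.start(), m.end(), "DURATION", value, modifier)
        )
    return sorted(found, key=lambda t: t.start)


if __name__ == "__main__":
    note = "Admitted on 03/14/2012 with chest pain for about 3 days."
    for timex in extract_timexes(note):
        print(timex)
```

Each rule couples recognition (the pattern) with normalization (the code that builds the value and modifier), which is one plausible reason rule-based components suit this task: the normalized value can be computed directly from the captured groups.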
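The gap between lenient identification and exact boundary detection noted in the Discussion can be illustrated with a small scoring sketch. It assumes mentions are represented as `(start, end)` character spans, that a lenient match means any overlap, and that micro-averaging pools counts over all documents before computing precision and recall. The function names and the toy spans are hypothetical, and the official challenge scorer may differ in its details.

```python
def overlaps(a, b):
    """True if two (start, end) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def micro_f(gold_docs, pred_docs, lenient=False):
    """Micro-averaged F score over documents.

    gold_docs and pred_docs are parallel lists; each item is a list of
    (start, end) spans for one document. Counts are pooled across documents
    before precision and recall are computed.
    """
    match = overlaps if lenient else (lambda a, b: a == b)
    matched_pred = matched_gold = n_pred = n_gold = 0
    for gold, pred in zip(gold_docs, pred_docs):
        n_gold += len(gold)
        n_pred += len(pred)
        # Predictions that match some gold span count toward precision.
        matched_pred += sum(any(match(p, g) for g in gold) for p in pred)
        # Gold spans that match some prediction count toward recall.
        matched_gold += sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / n_pred if n_pred else 0.0
    recall = matched_gold / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: the second prediction in document 1 has a wrong left boundary,
    # and the second prediction in document 2 is spurious.
    gold = [[(0, 10), (20, 30)], [(5, 9)]]
    pred = [[(0, 10), (22, 30)], [(5, 9), (40, 45)]]
    print("exact:  ", round(micro_f(gold, pred), 4))
    print("lenient:", round(micro_f(gold, pred, lenient=True), 4))
```

On the toy spans, the prediction with the shifted left boundary counts as a hit under lenient matching and as both a false positive and a false negative under exact matching, so the exact score is lower.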
