Ontology Enrichment from Free-text Clinical Documents: A Comparison of Alternative Approaches.

Abstract

While the biomedical informatics community widely acknowledges the utility of domain ontologies, many barriers to their effective use remain. One important requirement of a domain ontology is that it achieve a high degree of coverage of the domain's concepts and concept relationships. However, developing these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships, as well as difficulty in updating the ontology as domain knowledge changes. Methodologies developed in the fields of Natural Language Processing (NLP), Information Extraction (IE), Information Retrieval (IR), and Machine Learning (ML) provide techniques for automating the enrichment of ontologies from free-text documents. In this dissertation, I extended these methodologies to biomedical ontology development. First, I reviewed existing methodologies and systems developed in the fields of NLP, IR, and IE, and discussed how they can benefit the development of biomedical ontologies. This review, which had not previously been undertaken, was published in the Journal of Biomedical Informatics. Second, I compared the effectiveness of three methods from two different approaches, the symbolic (the Hearst method) and the statistical (the Church and Lin methods), using clinical free-text documents. Third, I developed a methodological framework for Ontology Learning (OL) evaluation and comparison. This framework permits evaluation of both types of OL approach, covering all three OL methods.

The significance of this work is as follows: 1) The results of the comparative study showed the potential of these methods for biomedical ontology enrichment. For the two target ontologies (NCIT and RadLex), the Hearst method yielded average new-concept acceptance rates of 21% and 11%, respectively. For NCIT, the Lin method produced a 74% acceptance rate and the Church method 53%. As a result of this study (published in the journal Methods of Information in Medicine), many of the suggested candidates have been incorporated into the NCIT; 2) The evaluation framework is flexible and general enough to analyze the performance of ontology enrichment methods in many domains, thus expediting the process of automation and minimizing the likelihood that key concepts and relationships are missed as domain knowledge evolves.

Bibliographic record

  • Author: Liu, Kaihong
  • Author affiliation: University of Pittsburgh
  • Degree-granting institution: University of Pittsburgh
  • Subject: Biology, Bioinformatics
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 169
  • Original format: PDF
  • Language: English