BMC Medical Informatics and Decision Making

Constructing fine-grained entity recognition corpora based on clinical records of traditional Chinese medicine


Abstract

In this study, we build a fine-grained entity annotation corpus, together with a corresponding annotation guideline, from traditional Chinese medicine (TCM) clinical records. Our aim is to provide a basis for future fine-grained corpus construction from TCM clinical records. We developed a four-step approach suited to constructing a corpus from TCM medical records. First, we determined the entity types included in this study through sample annotation. Then, we drafted a fine-grained annotation guideline by summarizing the characteristics of the dataset and referring to existing guidelines. We iteratively updated the guideline until the inter-annotator agreement (IAA) exceeded a Cohen’s kappa value of 0.9. Comprehensive annotation was then performed while keeping the IAA above 0.9. We annotated the 10,197 clinical records in five rounds, using four entity categories comprising 13 entity types. The final fine-grained annotated entity corpus consists of 1104 entities and 67,799 tokens. The final IAA averages 0.936 across three annotators, indicating that the fine-grained entity recognition corpus is of high quality. These results provide a foundation for future research on corpus construction and named entity recognition tasks in the TCM clinical domain.
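The agreement threshold above rests on Cohen's kappa, defined as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement between two annotators and p_e is the agreement expected by chance from their label frequencies. The sketch below shows one plausible way to obtain an average IAA for three annotators: compute kappa for each annotator pair and take the mean. The abstract does not state the unit of comparison or the averaging scheme, so token-level labels and pairwise averaging are assumptions here, and the label names (`SYMPTOM`, `HERB`, `O`) and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average kappa over all annotator pairs (assumed averaging scheme)."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy, made-up token labels from three hypothetical annotators.
toy_annotations = {
    "annotator_1": ["SYMPTOM", "O", "HERB", "O", "SYMPTOM", "O"],
    "annotator_2": ["SYMPTOM", "O", "HERB", "O", "O", "O"],
    "annotator_3": ["SYMPTOM", "O", "HERB", "HERB", "SYMPTOM", "O"],
}
print(f"mean pairwise kappa: {mean_pairwise_kappa(toy_annotations):.3f}")
```

In an iterative workflow like the one described, a value computed this way would be compared against the 0.9 threshold after each guideline revision.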
