Conference: International Conference on Language Resources and Evaluation

A Semi-supervised Approach for De-identification of Swedish Clinical Text

Abstract

Healthcare produces an abundance of electronic health records (EHRs) every day. These records hold information that is valuable for research and for the future improvement of healthcare. Multiple efforts have been made to protect patient privacy while making electronic health records usable for research by removing personally identifiable information from patient records. Supervised machine learning approaches to de-identification of EHRs need annotated training data, and annotation is costly in time and human resources. Annotating clinical text is even more costly, because the process must be carried out in a protected environment by a limited number of annotators who have signed confidentiality agreements. This paper therefore proposes a semi-supervised method for automatically creating high-quality training data. The study shows that the method can improve recall from 84.75% to 89.20% at a smaller cost in precision, which drops from 95.73% to 94.20%. For de-identification, recall is arguably more important than precision.
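To make the precision/recall trade-off in the abstract concrete, the sketch below combines the four reported figures into F-scores. The abstract itself does not report F-scores; the function `f_beta`, the dictionary names, and the choice of beta = 2 as a recall-weighted variant are assumptions made here for illustration only.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the usual F1 score; beta > 1 weights recall more heavily.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Figures quoted in the abstract, expressed as fractions.
BASELINE = {"precision": 0.9573, "recall": 0.8475}
SEMI_SUPERVISED = {"precision": 0.9420, "recall": 0.8920}

if __name__ == "__main__":
    for name, scores in (("baseline", BASELINE), ("semi-supervised", SEMI_SUPERVISED)):
        f1 = f_beta(**scores)            # balanced score
        f2 = f_beta(**scores, beta=2.0)  # recall-weighted score (illustrative choice)
        print(f"{name}: F1 = {100 * f1:.2f}%, F2 = {100 * f2:.2f}%")
```

Under these assumptions, F1 rises from roughly 89.9% to roughly 91.6%, so the gain in recall outweighs the loss in precision even before recall is given extra weight.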