A Semi-supervised Approach for De-identification of Swedish Clinical Text

Abstract

An abundance of electronic health records (EHRs) is produced every day within healthcare. These records contain valuable information for research and for the future improvement of healthcare. Multiple efforts have been made to protect patient privacy while making electronic health records usable for research by removing personally identifiable information from patient records. Supervised machine learning approaches for de-identification of EHRs need annotated training data, and such annotations are costly in time and human resources. Annotating clinical text is even more costly, as the process must be carried out in a protected environment by a limited number of annotators who have signed confidentiality agreements. This paper therefore proposes a semi-supervised method for automatically creating high-quality training data. The study shows that the method can improve recall from 84.75% to 89.20%, while precision drops by a smaller margin, from 95.73% to 94.20%. For de-identification, the model's recall is arguably more important than its precision.
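The trade-off described in the abstract can be put on a single scale with the F-beta score, which weights recall more heavily than precision when beta is greater than 1. The sketch below is illustrative only: the abstract reports precision and recall but no F-scores, so the values it computes are derived here from the reported figures and are not results stated by the authors. The names `f_beta`, `REPORTED`, and `summarize` are hypothetical.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta > 1 favours recall, beta < 1 favours precision, beta == 1 is F1.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall) pairs taken from the abstract, as fractions.
# The dictionary keys are labels chosen for this example only.
REPORTED = {
    "baseline": (0.9573, 0.8475),
    "semi_supervised": (0.9420, 0.8920),
}


def summarize(scores=REPORTED, betas=(1.0, 2.0)):
    """Return {system: {beta: F-beta}} for each reported system."""
    return {
        name: {beta: f_beta(p, r, beta) for beta in betas}
        for name, (p, r) in scores.items()
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        cells = ", ".join(f"F{beta:g}={score:.4f}" for beta, score in row.items())
        print(f"{name}: {cells}")
```

Under both weightings the semi-supervised figures score higher than the baseline, and the gap widens at beta = 2, which is consistent with the abstract's argument that the recall gain outweighs the precision loss when recall matters more.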


Bibliographic details

Source
International Conference on Language Resources and Evaluation | 2020 | pp. 4444-4450 | 7 pages
Authors
Hanna Berg; Hercules Dalianis
Original format: PDF
Keywords
semi-supervised learning; self training; de-identification; Swedish clinical text

