The CLASSE GATOR (CLinical Acronym SenSE disambiGuATOR): A Method for predicting acronym sense from neonatal clinical notes

Abstract

Objective: To develop an algorithm for identifying acronym 'sense' from clinical notes without requiring a clinically annotated training set.

Materials and Methods: Our algorithm is called CLASSE GATOR: CLinical Acronym SenSE disambiGuATOR. CLASSE GATOR extracts acronyms and definitions from PubMed Central (PMC). A logistic regression model is trained using words associated with specific acronym-definition pairs from PMC. CLASSE GATOR uses this library of acronym-definition pairs and their corresponding word feature vectors to predict the acronym 'sense' from Beth Israel Deaconess (MIMIC-III) neonatal notes.

Results: We identified 1,257 acronyms and 8,287 definitions, including a random definition, from 31,764 PMC articles on prenatal exposures and 2,227,674 PMC open access articles. The average number of senses (definitions) per acronym was 6.6 (min = 2, max = 50). The average internal 5-fold cross-validation performance was 87.9 % (on PMC). We found that 727 unique acronyms (57.29 %) from PMC were present in 105,044 neonatal notes (MIMIC-III). We evaluated the performance of acronym prediction using 245 manually annotated clinical notes with 9 distinct acronyms. CLASSE GATOR achieved an overall accuracy of 63.04 % and outperformed random for 8/9 acronyms (88.89 %) when applied to clinical notes. We also compared our algorithm with UMN's acronym set and found that CLASSE GATOR outperformed random for 63.46 % of 52 acronyms when using logistic regression, 75.00 % when using BERT, and 76.92 % when using BioBERT as the prediction algorithm within CLASSE GATOR.

Conclusions: CLASSE GATOR is the first automated acronym sense disambiguation method for clinical notes. Importantly, CLASSE GATOR does not require an expensive manually annotated acronym-definition corpus for training.
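The pipeline described in Materials and Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how acronyms and definitions are extracted, so the regex-based `extract_pairs` below is an assumption, the four-sentence `CORPUS` is a made-up stand-in for PMC text, and a small hand-written softmax regression (no bias term, no regularization) stands in for the logistic regression model. All function names are hypothetical, and the "random definition" sense mentioned in the Results is not modeled.

```python
import math
import re
from collections import Counter

# Made-up stand-in for PMC sentences that define an acronym in-line.
CORPUS = [
    "Infants with patent ductus arteriosus (PDA) often show a heart murmur on echocardiogram.",
    "A patent ductus arteriosus (PDA) may need ligation after echocardiogram of the heart.",
    "Each nurse carried a personal digital assistant (PDA) device with software for mobile data entry.",
    "The personal digital assistant (PDA) software runs on a mobile device.",
]

def extract_pairs(sentence):
    """Assumed heuristic: 'long form (ACR)' where word initials spell the acronym."""
    pairs = []
    for m in re.finditer(r"\(([A-Z]{2,})\)", sentence):
        acr = m.group(1)
        cand = sentence[:m.start()].split()[-len(acr):]
        if len(cand) == len(acr) and all(w[0].upper() == c for w, c in zip(cand, acr)):
            pairs.append((acr, " ".join(cand).lower()))
    return pairs

def features(text, exclude=()):
    """Bag-of-words counts, skipping the acronym and definition words themselves."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in exclude)

def predict_proba(weights, x):
    """Softmax over per-sense linear scores."""
    scores = {y: sum(wts[w] * v for w, v in x.items()) for y, wts in weights.items()}
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

def train(examples, epochs=200, lr=0.5):
    """Fit one weight vector per sense with plain stochastic gradient ascent."""
    labels = sorted({y for _, y in examples})
    weights = {y: Counter() for y in labels}  # Counter reads missing words as 0
    for _ in range(epochs):
        for x, y in examples:
            probs = predict_proba(weights, x)
            for label in labels:
                grad = (1.0 if label == y else 0.0) - probs[label]
                for w, v in x.items():
                    weights[label][w] += lr * grad * v
    return weights

def build_training_set(corpus):
    """Group (context features, definition) examples by acronym."""
    data = {}
    for sentence in corpus:
        for acr, definition in extract_pairs(sentence):
            exclude = set(definition.split()) | {acr.lower()}
            data.setdefault(acr, []).append((features(sentence, exclude), definition))
    return data

# One model per acronym, trained only on the literature-style corpus.
models = {acr: train(ex) for acr, ex in build_training_set(CORPUS).items()}

def disambiguate(acronym, note):
    """Pick the most probable sense of `acronym` given the words in a note."""
    probs = predict_proba(models[acronym], features(note, {acronym.lower()}))
    return max(probs, key=probs.get)

note = "Echocardiogram today shows small PDA, heart murmur persists."
print(disambiguate("PDA", note))  # patent ductus arteriosus
```

The point of the sketch is that the training labels come for free from sentences that define the acronym in-line, so no manually annotated clinical corpus is needed; the clinical note is only seen at prediction time.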


Bibliographic details

Source
International journal of medical informatics | 2020, Issue 5 | 104101.1-104101.10 | 10 pages
Author affiliations

Univ Penn Dept Comp Sci Philadelphia PA 19104 USA;

Childrens Hosp Philadelphia Dept Pediat Div Neonatol Philadelphia PA 19104 USA|Univ Penn Perelman Sch Med Philadelphia PA 19104 USA;

Univ Penn Perelman Sch Med Dept Biostat Epidemiol & Informat Philadelphia PA 19104 USA|Univ Penn Inst Biomed Informat Philadelphia PA 19104 USA|Univ Penn Ctr Excellence Environm Toxicol Philadelphia PA 19104 USA|Childrens Hosp Philadelphia Dept Biomed & Hlth Informat Philadelphia PA 19104 USA;

Original format: PDF
Language: English
Keywords
Electronic health records; Natural language processing; Secondary reuse; Transfer learning;

