Contributions to Clinical Named Entity Recognition in Portuguese

Abstract

Bearing in mind that different languages may pose different challenges, this paper presents the following contributions to Information Extraction from clinical text, targeting the Portuguese language: a collection of 281 clinical texts in this language, with manually annotated named entities; word embeddings trained on a larger collection of similar texts; and results of using BiLSTM-CRF neural networks for named entity recognition on the annotated collection, including a comparison of in-domain and out-of-domain word embeddings in this task. Although learned from much less data, in-domain embeddings yield higher performance. When tested on 20 independent clinical texts, the model using in-domain embeddings achieved better results than a model using larger out-of-domain embeddings.
