Journal: Complexity

ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition



Abstract

The web is being loaded daily with a huge volume of data, mainly unstructured textual data, which significantly increases the need for information extraction and NLP systems. Named-entity recognition (NER) is a key step towards understanding text data efficiently and saving time and effort. Because English is so widely used globally, it dominates the research conducted in this field, especially in the biomedical domain. Unlike other languages, Arabic suffers from a lack of resources. This work presents a BERT-based model that identifies biomedical named entities in Arabic text (specifically disease and treatment named entities) and investigates how effectively pretraining a monolingual BERT model on a small-scale biomedical dataset enhances the model's understanding of Arabic biomedical text. The model's performance was compared with that of two state-of-the-art models (namely, AraBERT and multilingual BERT cased), and it outperformed both, achieving an F1-score of 85%.
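The abstract reports an F1-score but does not say on this page how it was computed. As a minimal sketch only, the snippet below shows one common convention for NER evaluation: entity-level, exact-match F1 over BIO-tagged sequences. The tag names (`B-Disease`, `I-Disease`, `B-Treatment`) and the helper functions `extract_spans` and `entity_f1` are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence.

    A span starts at a B-X tag and extends over following I-X tags of the
    same type. An I-X tag with no preceding B-X is ignored (strict reading).
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged exact-match F1 over all sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx,) + s for s in extract_spans(g)}
        pred_spans |= {(idx,) + s for s in extract_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up tags: the prediction misses the treatment entity.
gold = [["B-Disease", "I-Disease", "O", "B-Treatment"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
print(round(entity_f1(gold, pred), 3))
```

In this toy case one of two gold entities is found and the single predicted entity is correct, so precision is 1.0, recall is 0.5, and the F1-score is 2/3. Token-level or partial-match scoring would give different numbers, so a reported F1 is only comparable across models when the same convention is used.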
