Proceedings of 2011 3rd International Conference on Awareness Science and Technology

BFSM: Finite state machine learned as name boundary definer for bio named entity recognition

Abstract

An essential task in automated information extraction from biomedical literature is bio named entity recognition, which identifies the boundaries between ordinary words and biomedical technical terms in text and classifies those terms according to domain knowledge. Because of the nature of bio named entities, defining their boundaries in text remains challenging. This paper proposes using the part-of-speech tags of tokens as the observations for a name boundary definer. We propose an approach that models a finite state machine as the boundary definer. Aided by machine learning methods, including frequent pattern mining and a Bayesian network, the finite state machine learns from the part-of-speech tags of tokens in biomedical text. The finite state machine based on the Bayesian network is named BFSM. In addition, we report the influence of the part-of-speech tagger on the learning of BFSM. Experimental results show that a named entity recognition system using BFSM achieves high accuracy, with an F-score of 85.8.
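To make the idea of a state machine driven by part-of-speech tags more concrete, the following is a minimal sketch and not the authors' BFSM. It assumes a two-state machine ("OUT" for ordinary words, "IN" for tokens inside a name) and replaces the Bayesian network with simple smoothed conditional counts; the frequent pattern mining step is omitted. The function names, the state labels, and the toy training data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

STATES = ("OUT", "IN")  # assumed labels: outside / inside a named entity


def train(tagged_sentences):
    """Count how often (previous state, POS tag) leads to each next state.

    Each sentence is a list of (pos_tag, state) pairs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = "OUT"  # assumption: every sentence starts outside a name
        for pos, state in sentence:
            counts[(prev, pos)][state] += 1
            prev = state
    return counts


def next_state(counts, prev, pos):
    """Pick the most frequent next state, using add-one smoothing."""
    seen = counts.get((prev, pos), {})
    scores = {s: seen.get(s, 0) + 1 for s in STATES}
    # max() returns the first maximal item, so ties resolve to "OUT".
    return max(STATES, key=lambda s: scores[s])


def find_boundaries(counts, pos_tags):
    """Run the machine over a POS sequence and return [start, end) spans."""
    spans, prev, start = [], "OUT", None
    for i, pos in enumerate(pos_tags):
        state = next_state(counts, prev, pos)
        if state == "IN" and prev == "OUT":
            start = i  # a name begins at this token
        elif state == "OUT" and prev == "IN":
            spans.append((start, i))  # the name ended before this token
        prev = state
    if prev == "IN":
        spans.append((start, len(pos_tags)))  # name runs to sentence end
    return spans


# Toy, hand-labelled training data (hypothetical, for illustration only).
TRAINING = [
    [("DT", "OUT"), ("NN", "IN"), ("NN", "IN"), ("VBZ", "OUT"),
     ("DT", "OUT"), ("NN", "OUT")],
    [("NN", "IN"), ("NN", "IN"), ("VBZ", "OUT"), ("JJ", "OUT")],
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(find_boundaries(model, ["DT", "NN", "NN", "VBZ"]))
```

Each returned span is a half-open token range that marks where a candidate name starts and ends. The decoding here is greedy, one token at a time; a fuller treatment would presumably score whole tag sequences, and the abstract does not say which strategy the authors use.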