Published in: IEICE Transactions on Information and Systems

Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field



Abstract

Sci-tech intelligence resources contain many knowledge entities. Extracting these entities is of great importance for building knowledge networks, exploring the relationships between pieces of knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement yet still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach to knowledge entity extraction based on BiLSTM and a conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and a CRF is then employed to integrate global label information and produce the optimal label sequence. This approach does not require the manual construction of features, and it outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and the corresponding knowledge networks are established, with further elaboration on the correlation between two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
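The decoding step mentioned in the abstract, where a CRF combines per-token scores with label-to-label scores to pick the best overall label sequence, can be illustrated with a small Viterbi sketch. This is not the authors' implementation: the `O`/`B`/`I` label set, the function name `viterbi_decode`, and all numeric scores below are hypothetical values chosen for illustration. In a real BiLSTM-CRF model the emission scores would presumably come from the BiLSTM and the transition scores would be learned.

```python
from typing import List, Sequence

# Hypothetical label set for illustration: Outside, Begin, Inside.
LABELS = ["O", "B", "I"]

# TRANSITIONS[i][j]: score of moving from label i to label j (made-up values).
# The O -> I move is heavily penalized because an entity cannot start with "I".
TRANSITIONS = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B
    [0.0, 0.0, 0.0],    # from I
]

# EMISSIONS[t][j]: score of label j at token t (made-up values standing in
# for what a BiLSTM would output for a three-token sentence).
EMISSIONS = [
    [2.0, 0.5, 0.0],
    [0.0, 1.0, 1.5],
    [0.0, 0.0, 2.0],
]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the label indices of the highest-scoring sequence."""
    n_labels = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def greedy_decode(emissions: Sequence[Sequence[float]]) -> List[int]:
    """Pick the best label for each token independently (no CRF)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


if __name__ == "__main__":
    print([LABELS[i] for i in greedy_decode(EMISSIONS)])
    print([LABELS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)])
```

With these made-up scores, labeling each token independently yields the invalid sequence O, I, I, while Viterbi decoding with the transition scores yields O, B, I. This is the sense in which the CRF layer uses global label information rather than per-token decisions.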
