Published in: Advances in multidisciplinary retrieval (conference proceedings)

Automatic Extraction and Resolution of Bibliographical References in Patent Documents

Abstract

This paper describes experiments with Conditional Random Fields (CRFs) for extracting bibliographical references from patent documents. CRFs are used to perform extraction and parsing tasks, both of which are expressed as sequence tagging problems. The automatic recognition covers references to other patent documents and to scholarly publications, both of which are characterized by a strong variability of contexts and patterns. Our work is not limited to the extraction of reference blocks; it also includes fine-grained parsing and the resolution of the bibliographical references, based on data normalization and access to different online bibliographical services. For these tasks, CRF models significantly outperform existing rule-based algorithms and other machine learning techniques. In particular, they achieve very high performance for patent reference extraction, reducing the error rate by approximately 75% compared to previous work.
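To make the "sequence tagging" framing concrete, the following is a minimal sketch of how a reference span in running text can be encoded as per-token labels and recovered again. It does not train a CRF and is not the authors' implementation. The BIO label scheme, the `PATENT` label name, the feature set, the helper names, and the example sentence are all assumptions made for illustration. A CRF would typically be trained on feature dictionaries and label sequences of this general shape.

```python
import re

def tokenize(text):
    """Split text into word and punctuation tokens with character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]

def bio_labels(tokens, spans):
    """Label each token B-KIND, I-KIND or O, given (start, end, kind) spans."""
    labels = []
    for _, start, end in tokens:
        label = "O"
        for s, e, kind in spans:
            if start >= s and end <= e:
                # The first token of a span gets B-, the rest get I-.
                label = ("B-" if start == s else "I-") + kind
                break
        labels.append(label)
    return labels

def token_features(tokens, i):
    """Hypothetical per-token features of the kind a CRF tagger could use."""
    word = tokens[i][0]
    return {
        "lower": word.lower(),
        "is_digit": word.isdigit(),
        "is_upper": word.isupper(),
        "prev": tokens[i - 1][0].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1][0].lower() if i < len(tokens) - 1 else "<EOS>",
    }

def decode_spans(text, tokens, labels):
    """Turn a BIO label sequence back into (kind, surface text) pairs."""
    spans, current = [], None
    for (_, start, end), label in zip(tokens, labels):
        if label.startswith("B-"):
            if current:
                spans.append(current)
            current = [label[2:], start, end]
        elif label.startswith("I-") and current and current[0] == label[2:]:
            current[2] = end  # extend the open span
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(kind, text[s:e]) for kind, s, e in spans]

# Invented example sentence containing one patent reference.
text = "as described in US 5,123,456 B1 and elsewhere"
start = text.index("US")
spans = [(start, start + len("US 5,123,456 B1"), "PATENT")]

tokens = tokenize(text)
labels = bio_labels(tokens, spans)
features = [token_features(tokens, i) for i in range(len(tokens))]

for (word, _, _), label in zip(tokens, labels):
    print(f"{word}\t{label}")
print(decode_spans(text, tokens, labels))
```

In this framing, extraction amounts to predicting the label column from the feature column, and fine-grained parsing can reuse the same machinery with a richer label set (for example, separate labels for authority, number, and kind code).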
