Automatic Extraction and Resolution of Bibliographical References in Patent Documents


Abstract

This paper describes experiments with Conditional Random Fields (CRFs) for extracting bibliographical references from patent documents. CRFs are used to perform extraction and parsing tasks, which are expressed as sequence tagging problems. The automatic recognition covers references to other patent documents and to scholarly publications, both of which are characterized by a strong variability of contexts and patterns. Our work is not limited to the extraction of reference blocks; it also includes fine-grained parsing and the resolution of the bibliographical references, based on data normalization and access to different online bibliographical services. For these different tasks, CRF models significantly outperform existing rule-based algorithms and other machine learning techniques, with particularly high performance for patent reference extraction: the error rate is reduced by approximately 75% compared to previous work.
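
To illustrate what "expressed as sequence tagging problems" can mean in practice, here is a minimal Python sketch. It is not the paper's implementation: the label names (`PATENT`, `NPL`), the BIO labelling scheme, the example sentence, and the function `bio_to_spans` are all assumptions made for illustration, and the labels are written by hand rather than predicted by a CRF. The sketch only shows how per-token labels, once a tagger has produced them, can be grouped back into reference blocks.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-labelled tokens into (reference_type, text) spans.

    "B-X" opens a span of type X, "I-X" continues an open span of the
    same type, and anything else (including a stray "I-X" with no
    matching open span) closes the current span.
    """
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            flush()
            current_type = None
            current_tokens = []
    flush()
    return spans


# Hypothetical tokenized patent sentence with hand-written labels.
tokens = ["as", "described", "in", "US", "5,123,456", "and", "by",
          "Smith", "et", "al.", ",", "Nature", "(", "1999", ")"]
labels = ["O", "O", "O", "B-PATENT", "I-PATENT", "O", "O",
          "B-NPL", "I-NPL", "I-NPL", "I-NPL", "I-NPL", "I-NPL", "I-NPL", "I-NPL"]

for ref_type, text in bio_to_spans(tokens, labels):
    print(ref_type, "->", text)
```

In a real system the `labels` list would come from a trained sequence model, and each extracted span would then be parsed into fields and normalized before being looked up in a bibliographical service.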


Bibliographic details

Source
Advances in multidisciplinary retrieval | 2010 | pp. 120-135 | 16 pages
Conference location: Vienna (AT)
Author
Patrice Lopez
Original format: PDF
Language: English
Chinese Library Classification: Information processing

Similar literature

1. Patent Issued for Phrase-Based Document Clustering with Automatic Phrase Extraction [J]. Journal of Engineering. 2013, No. 12
2. Patent Issued for Image Forming System Capable of Causing Document Box Information of the Printer Driver to Automatically Adjust to a Change in the Document Box Information That Is Stored in an Image [J]. Journal of Engineering. 2013, No. 13
3. Traces of Prior Art: An analysis of non-patent references found in patent documents [J]. Callaert J, Van Looy B, Verbeek A. Scientometrics. 2006, No. 1
4. Automatic Extraction and Resolution of Bibliographical References in Patent Documents [C]. Patrice Lopez. Information Retrieval Facility Conference. 2010
5. Coreference, cross-document coreference, and information extraction methodologies [D]. Bagga, Amit. 1998
6. Boosting automatic event extraction from the literature using domain adaptation and coreference resolution [O]. Makoto Miwa, Paul Thompson, Sophia Ananiadou
7. Traces of prior art. An analysis of non-patent references found in patent documents [O]. Callaert J, Van Looy Bart, Verbeek Arnold. 2005
