Journal of Information Science
Cross-language patent matching via an international patent classification-based concept bridge

Abstract

Patent documents contain sophisticated technical information that is valuable for developing new technologies and products. They can be written in almost any language, which creates language barriers during retrieval. Traditionally, cross-language information retrieval and cross-language document matching have relied on text-translation-based or index-set-mapping methods. These traditional methods face several challenges, however, such as the difficulty of natural language translation, complications arising from bilingual or multilingual translation (translating between two or more languages), and the unavailability of a parallel dual-language document set. This study offers a new and robust solution to cross-language patent document matching: the International Patent Classification (IPC) based concept bridge approach. The proposed method applies Latent Semantic Indexing to extract concepts from each set of patent documents and uses the IPC codes to construct a cross-language mediator that expresses patent documents in different languages. Experiments were carried out to demonstrate the performance of the proposed method. A total of 3000 English patents and 3000 Chinese patents were gathered as training documents from the United States Patent and Trademark Office and the Taiwan Intellectual Property Office, respectively. A further 30 English patents and 30 Chinese patents were collected as query patents. Finally, evaluations using an objective measure and subjective judgement were conducted to demonstrate the feasibility and effectiveness of our method. The results show that our method outperforms traditional text-translation methods.
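The abstract does not spell out how the concept bridge is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each language gets its own Latent Semantic Indexing space (a truncated SVD of a term-document matrix), that each IPC code is summarized by the centroid of its training documents in that space, and that a patent is then described by its cosine similarity to every IPC centroid. Because the resulting profile is indexed by IPC code rather than by words, profiles from different languages can be compared directly. The function names, the toy matrices, and the two IPC codes are all hypothetical.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Truncated SVD of a (terms x docs) matrix.
    Returns the term basis U_k and the training documents in concept space."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = Vt[:k, :].T * s[:k]        # (docs x k), equal to term_doc.T @ U_k
    return U_k, doc_vecs

def ipc_centroids(doc_vecs, labels, codes):
    """Mean concept vector of the training documents under each IPC code."""
    labels = np.asarray(labels)
    return np.vstack([doc_vecs[labels == c].mean(axis=0) for c in codes])

def ipc_profile(term_vec, U_k, centroids):
    """Fold a new document into concept space, then take its cosine
    similarity to every IPC centroid (assumed bridge representation)."""
    vec = term_vec @ U_k
    norms = np.linalg.norm(centroids, axis=1) * np.linalg.norm(vec)
    return centroids @ vec / np.where(norms == 0, 1.0, norms)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Toy data (hypothetical): 4 terms x 4 training patents per language.
# Rows 0-1 are computing terms, rows 2-3 are pharmaceutical terms.
codes = ["G06F", "A61K"]
labels = ["G06F", "G06F", "A61K", "A61K"]
A_en = np.array([[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 3, 1], [0, 0, 1, 3]], float)
A_zh = np.array([[1, 2, 0, 0], [2, 1, 0, 0], [0, 0, 2, 3], [0, 0, 3, 2]], float)

U_en, docs_en = lsi_fit(A_en, k=2)
U_zh, docs_zh = lsi_fit(A_zh, k=2)
cent_en = ipc_centroids(docs_en, labels, codes)
cent_zh = ipc_centroids(docs_zh, labels, codes)

# One English query about computing, two Chinese candidates.
q_en = ipc_profile(np.array([1.0, 1.0, 0.0, 0.0]), U_en, cent_en)
zh_computing = ipc_profile(np.array([1.0, 0.0, 0.0, 0.0]), U_zh, cent_zh)
zh_pharma = ipc_profile(np.array([0.0, 0.0, 1.0, 1.0]), U_zh, cent_zh)

same_topic = cosine(q_en, zh_computing)   # profiles agree on the IPC code
other_topic = cosine(q_en, zh_pharma)     # profiles point at different codes
print(same_topic, other_topic)
```

No translation step and no parallel corpus appear anywhere in this sketch: the only thing the two languages share is the IPC labelling of their training patents, which is the property the abstract highlights.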
