System and method for facts extraction and domain knowledge repository creation from unstructured and semi-structured documents

Abstract

Provided are methods and systems that extract facts from unstructured documents and build an oracle for various domains. The present invention addresses the problem of efficiently finding and extracting facts about a particular subject domain from semi-structured and unstructured documents. It infers new facts from the extracted facts and provides ways of verifying the facts, thus becoming a source of knowledge about the domain that can be queried effectively. The methods and systems can also extract temporal information from unstructured and semi-structured documents, and can find and extract dynamically generated documents from the Deep or Dynamic Web.

Bibliographic details

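  • Publication number: US8682674B1

  • Patent type:

  • Publication date: 2014-03-25

  • Original format: PDF

  • Applicant/patentee: GLENBROOK NETWORKS

  • Application number: US201313802411

  • Inventors: EDWARD KOMISSARCHIK; JULIA KOMISSARCHIK

  • Filing date: 2013-03-13

  • Classification: G10L15/00; G06F17/27

  • Country: US

  • Database entry time: 2022-08-21 16:00:57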
