System, method and computer program product for performing unstructured information management and automatic text analysis, including an annotation inverted file system facilitating indexing and searching

Abstract

Disclosed are a system architecture, components, and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators, and various adapters. The search is performed in two levels. The data processing system includes a token inverted file system that stores tokens obtained from document data by at least one tokenizer. An annotation inverted file system stores annotations, a list of one or more occurrences of each annotation, and, for each listed occurrence, a set of at least two token locations spanned by that annotation.
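The relationship between the two inverted files can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not spell out the two search levels, so the sketch assumes one plausible reading: a token lookup, followed by a filter against annotation spans. The whitespace tokenizer, the class and function names, the sample sentence, and the choice to store each occurrence as a (begin, end) pair of token positions are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercasing whitespace tokenizer (assumed; the abstract names no tokenizer)."""
    return text.lower().split()


class TokenIndex:
    """Token inverted file: token -> list of (doc_id, token_position)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add_document(self, doc_id, text):
        for pos, tok in enumerate(tokenize(text)):
            self.postings[tok].append((doc_id, pos))


class AnnotationIndex:
    """Annotation inverted file: label -> list of (doc_id, begin, end).

    begin and end are inclusive token positions, i.e. the two token
    locations spanned by one occurrence of the annotation.
    """

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, label, doc_id, begin, end):
        self.postings[label].append((doc_id, begin, end))


def tokens_within(token_index, annotation_index, token, label):
    """Level 1: look up the token. Level 2: keep hits inside a `label` span."""
    spans = annotation_index.postings.get(label, [])
    hits = []
    for doc_id, pos in token_index.postings.get(token, []):
        if any(d == doc_id and b <= pos <= e for d, b, e in spans):
            hits.append((doc_id, pos))
    return hits


# Hypothetical sample data: one document and two hand-made annotations.
tokens = TokenIndex()
tokens.add_document(1, "New York Times reported from New York")

annotations = AnnotationIndex()
annotations.add("Organization", 1, 0, 2)  # "New York Times"
annotations.add("Location", 1, 5, 6)      # "New York"

in_location = tokens_within(tokens, annotations, "york", "Location")
in_organization = tokens_within(tokens, annotations, "york", "Organization")
print(in_location)      # [(1, 6)]
print(in_organization)  # [(1, 1)]
```

Because annotation occurrences are stored as token positions, a query such as "the token york inside a Location" can be answered from the two indexes alone, without rereading the document text.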