System, method and computer program product for performing unstructured information management and automatic text analysis, including an annotation inverted file system facilitating indexing and searching
Disclosed are a system architecture, its components, and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators, and various adapters. Searching uses a two-level technique. The data processing system includes a token inverted file system that stores tokens obtained from document data by at least one tokenizer. An annotation inverted file system stores annotations, a list of one or more occurrences of each annotation, and, for each listed occurrence, a set of at least two token locations spanned by the respective annotation.
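The two index structures named in the abstract can be pictured with a minimal in-memory sketch. It is only an illustration under stated assumptions, not the patented implementation: the class names `TokenIndex` and `AnnotationIndex`, the whitespace tokenizer, and the helper `tokens_within` are all hypothetical, and each annotation occurrence is assumed to record its first and last token positions as the "at least two token locations". The abstract does not define the two-level search, so the sketch shows only a simple query that combines both indexes.

```python
from collections import defaultdict


def tokenize(text):
    """Hypothetical whitespace tokenizer standing in for the system's tokenizer(s)."""
    return text.lower().split()


class TokenIndex:
    """Token inverted file: token -> list of (doc_id, token_position)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add_document(self, doc_id, text):
        for position, token in enumerate(tokenize(text)):
            self.postings[token].append((doc_id, position))


class AnnotationIndex:
    """Annotation inverted file: label -> list of occurrences.

    Each occurrence is (doc_id, first_token, last_token), i.e. the token
    locations spanned by the annotation (an assumed encoding).
    """

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, label, doc_id, first_token, last_token):
        self.postings[label].append((doc_id, first_token, last_token))


def tokens_within(token_index, annotation_index, token, label):
    """Return (doc_id, position) hits of `token` that fall inside a `label` span."""
    hits = []
    for doc_id, position in token_index.postings.get(token, []):
        for ann_doc, first, last in annotation_index.postings.get(label, []):
            if ann_doc == doc_id and first <= position <= last:
                hits.append((doc_id, position))
                break  # one matching span is enough for this hit
    return hits


# Usage: index one document and two annotations, then query across both indexes.
tokens = TokenIndex()
tokens.add_document("d1", "IBM opened a lab in Haifa")
annotations = AnnotationIndex()
annotations.add("Organization", "d1", 0, 0)
annotations.add("Location", "d1", 5, 5)
print(tokens_within(tokens, annotations, "haifa", "Location"))
```

Keeping annotations in their own inverted file, keyed by label and addressed by token positions, lets a query restrict token matches to annotated spans without rescanning the document text.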