Method and apparatus for document image indexing and retrieval using multi-level document image structure and local features
Abstract
An image-based document indexing and retrieval method is described. During indexing, each source document is analyzed to generate index information at the document, page, region, and unit levels. Region- and unit-level index information is generated by segmenting each text region into units, constructing unit-length or unit-density histograms, and analyzing the units in the few most frequent histogram bins. The index information and the source document images are stored in a database. During retrieval, a target document is analyzed to generate target index information in the same way as during indexing. The target index information is compared with the stored index information progressively, from higher to lower levels, to identify source documents whose index information matches it. Fuzzy logic is used in the comparison steps to make retrieval more robust.
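The indexing and retrieval flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name here (`unit_length_index`, `fuzzy_match`, `region_similarity`, `retrieve`) is hypothetical, and so are the specific choices: fixed-width bins, page count as the only document-level feature, a triangular fuzzy membership function, and the numeric tolerances. The abstract does not specify any of these details.

```python
from collections import Counter


def unit_length_index(unit_lengths, bin_width=5, top_k=3):
    """Histogram unit lengths and keep the top_k most frequent bins.

    Returns {bin_number: fraction_of_units_in_that_bin}.
    bin_width and top_k are illustrative assumptions.
    """
    bins = Counter(length // bin_width for length in unit_lengths)
    total = len(unit_lengths)
    # Sort by count (descending), then by bin number so ties are deterministic.
    top = sorted(bins.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {b: count / total for b, count in top}


def fuzzy_match(a, b, tolerance):
    """Triangular membership: 1.0 when a == b, falling linearly to 0.0
    once the values differ by `tolerance` or more (an assumed form)."""
    return max(0.0, 1.0 - abs(a - b) / tolerance)


def region_similarity(target_index, source_index, tolerance=0.2):
    """Mean fuzzy score over every bin that appears in either index."""
    bins = set(target_index) | set(source_index)
    if not bins:
        return 0.0
    scores = [
        fuzzy_match(target_index.get(b, 0.0), source_index.get(b, 0.0), tolerance)
        for b in bins
    ]
    return sum(scores) / len(bins)


def retrieve(target, sources, page_tolerance=1, threshold=0.7):
    """Progressive comparison: a cheap document-level filter (page count)
    first, then the finer region-level histogram comparison on survivors."""
    candidates = [
        s for s in sources if abs(s["pages"] - target["pages"]) <= page_tolerance
    ]
    scored = [
        (region_similarity(target["index"], s["index"]), s["name"])
        for s in candidates
    ]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    index = unit_length_index([10, 11, 12, 13, 14, 20, 21, 30], top_k=2)
    target = {"pages": 2, "index": index}
    sources = [
        {"name": "a", "pages": 2, "index": dict(index)},
        {"name": "b", "pages": 10, "index": dict(index)},
        {"name": "c", "pages": 2, "index": {7: 0.9}},
    ]
    print(retrieve(target, sources))
```

In this sketch the coarse page-count check discards most candidates before any histogram comparison runs, which is the point of comparing from higher to lower levels. The fuzzy score lets a slightly degraded scan still match, because small differences in bin frequencies reduce the score gradually instead of rejecting the candidate outright.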