A method is presented to filter the output of a word recognition algorithm, which may contain errors, to locate decisions that should be correct with a high degree of certainty. The algorithm uses the output of a word recognition system, together with information retrieval techniques that characterize a free-text document database, to locate a set of documents whose topics are similar to that of the input document. The vocabulary from these similar documents is then used to locate the correct word recognition decisions. Experimental results show that a subset of the word recognition decisions for an input document can be located that is between 90% and 99% correct. The subset located by this method can be used to drive other recognition processes applied to the rest of the text.
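
The filtering idea described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, term weighting, or number of documents retrieved, so everything below is an assumption made for illustration: documents are bags of words, similarity is cosine over raw term counts, and the names `cosine`, `filter_decisions`, and `top_k` are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_decisions(decisions, database, top_k=2):
    """Keep the recognizer decisions that are supported by similar documents.

    decisions: list of words output by the recognizer (may contain errors).
    database:  list of documents, each a list of words.
    top_k:     number of most similar documents to draw vocabulary from
               (an assumed parameter, not taken from the source).
    Returns (position, word) pairs for the decisions that are kept.
    """
    query = Counter(decisions)
    # Rank database documents by similarity to the noisy recognizer output.
    ranked = sorted(
        database,
        key=lambda doc: cosine(query, Counter(doc)),
        reverse=True,
    )
    # Pool the vocabulary of the most similar documents.
    vocabulary = set()
    for doc in ranked[:top_k]:
        vocabulary.update(doc)
    # A decision is kept only if its word occurs in that vocabulary.
    return [(i, word) for i, word in enumerate(decisions) if word in vocabulary]
```

For example, with recognizer output `["the", "stock", "markel", "fell", "sharply"]` and a database whose most similar document is `["stock", "market", "fell", "prices"]`, calling `filter_decisions(..., top_k=1)` keeps only "stock" and "fell". The misrecognized "markel" is dropped, and so are correct words that happen not to appear in the retrieved vocabulary, which is consistent with the method selecting a high-precision subset rather than correcting every word.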