Patent 10896292 was granted by the United States Patent and Trademark Office in January 2021 and assigned to First American Financial.
Implementations of the disclosure are directed to OCR error correction systems and methods. In some implementations, a method comprises: obtaining, at a computing device, optical character recognition (OCR) text extracted from a document image, the text comprising a token; searching a corpus, at the computing device, for a best word to replace the token, the search being based on (i) a token bigram determined from the token and (ii) a mapping between words in the corpus and a corpus bigram set composed of unique bigrams from the beginning or ending of the words in the corpus; and replacing, at the computing device, the token with the best word.
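The method summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the "best word" is scored, so the sketch assumes a character-level similarity from Python's standard difflib module and a hypothetical min_ratio cutoff. The function names (build_bigram_index, best_word, correct_text), the whitespace tokenization, and the choice to key the mapping by bigram are likewise assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def build_bigram_index(corpus_words):
    """Map each beginning or ending bigram to the corpus words that carry it.

    Assumption: the mapping is keyed by (position, bigram); the abstract only
    says a mapping exists between corpus words and the corpus bigram set.
    """
    index = defaultdict(set)
    for word in corpus_words:
        w = word.lower()
        if len(w) < 2:
            continue  # a one-character word has no bigram
        index[("start", w[:2])].add(w)
        index[("end", w[-2:])].add(w)
    return index


def best_word(token, index, min_ratio=0.6):
    """Return the most similar corpus word sharing a bigram with the token.

    Assumption: similarity is difflib's SequenceMatcher ratio, and min_ratio
    is a hypothetical cutoff below which the token is left unchanged.
    """
    t = token.lower()
    if len(t) < 2:
        return token
    # Only words sharing the token's first or last bigram are searched.
    candidates = index.get(("start", t[:2]), set()) | index.get(("end", t[-2:]), set())
    if not candidates:
        return token

    def score(word):
        return SequenceMatcher(None, t, word).ratio()

    # Sorting first makes tie-breaking deterministic.
    best = max(sorted(candidates), key=score)
    return best if score(best) >= min_ratio else token


def correct_text(ocr_text, index):
    """Replace each whitespace-separated token with its best corpus word."""
    return " ".join(best_word(tok, index) for tok in ocr_text.split())


index = build_bigram_index(["mortgage", "property", "title", "deed"])
corrected = correct_text("mortgaqe tit1e deed", index)
```

Under these assumptions, the bigram lookup narrows the search to words that share the token's first or last two characters, so an OCR error in the middle or at one end of a token still leaves at least one intact bigram to match on.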