Generate sets of words Tesseract is likely to find ambiguous
ambiguous_words [-l lang] TESSDATADIR WORDLIST AMBIGUOUSFILE
ambiguous_words(1) runs Tesseract in a special mode and, for each word in the word list, produces a set of words that Tesseract thinks might be ambiguous with it. TESSDATADIR must be set to the absolute path of a directory containing tessdata/lang.traineddata, where lang is the language given with -l.
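As an illustration of the argument order and the TESSDATADIR requirement, the following is a minimal Python sketch that assembles the command line from the synopsis. The helper name build_command is hypothetical and not part of Tesseract; it only checks the two conditions stated above (an absolute TESSDATADIR and a tessdata/lang.traineddata file inside it) and does not run the tool.

```python
import os


def build_command(tessdatadir, wordlist, ambiguousfile, lang=None):
    """Return an argv list for ambiguous_words, in synopsis order.

    Hypothetical helper for illustration only; it validates the
    TESSDATADIR constraints described above and does not run anything.
    """
    # TESSDATADIR must be an absolute path.
    if not os.path.isabs(tessdatadir):
        raise ValueError("TESSDATADIR must be an absolute path")

    # When a language is given, TESSDATADIR/tessdata/<lang>.traineddata
    # must exist.
    if lang is not None:
        model = os.path.join(tessdatadir, "tessdata", lang + ".traineddata")
        if not os.path.isfile(model):
            raise FileNotFoundError(model)

    argv = ["ambiguous_words"]
    if lang is not None:
        argv += ["-l", lang]  # optional [-l lang]
    argv += [tessdatadir, wordlist, ambiguousfile]
    return argv
```

The returned list could then be passed to subprocess.run(argv, check=True), assuming ambiguous_words is installed and on the PATH.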
Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0
The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).