How to use ocrevalUAtion
Mike Gerber edited this page Jul 4, 2019 · 14 revisions
Once you have installed a version of ocrevaluation.jar, you can run it as follows:
java -cp ocrevaluation.jar eu.digitisation.Main \
-gt {ground_truth_file} [{encoding}] \
-ocr {ocr_file} [{encoding}] \
-d {output_directory} [-r {equivalences_file}]
Where:
- {ground_truth_file} = the full path to a ground truth file. Supported formats: Text, PAGE XML.
- {ocr_file} = the full path to an OCR result file. Supported formats: Text, PAGE XML, FineReader10 XML, hOCR HTML.
- {output_directory} = the folder where the report (HTML format) will be generated.
- {encoding} = the character encoding of the file named just before it (optional).
- {equivalences_file} = an optional text file describing equivalences between Unicode characters. Each line holds two sequences of hexadecimal code points, separated by a comma.
Example:
java -cp ocrevaluation.jar eu.digitisation.Main \
-gt groundtruth.xml -ocr ocr.txt utf8 \
-d output -r equivalences.csv
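To evaluate many pages at once, the same invocation can be wrapped in a small shell loop. The following is only a sketch under assumptions that do not come from this page: the function name `ocreval_cmd`, the `gt/` and `ocr/` folders, the matching base names, and the `reports/` output folder are all hypothetical. It prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Sketch: print one ocrevalUAtion command per ground-truth/OCR pair.
# The jar location, folder layout and function name are assumptions.
JAR=ocrevaluation.jar

ocreval_cmd() {
    # $1 = ground truth file, $2 = OCR file, $3 = output directory
    printf 'java -cp %s eu.digitisation.Main -gt %s -ocr %s -d %s\n' \
        "$JAR" "$1" "$2" "$3"
}

# Hypothetical layout: gt/<name>.xml is paired with ocr/<name>.txt
for gt in gt/*.xml; do
    [ -e "$gt" ] || continue    # the glob matched nothing
    name=$(basename "$gt" .xml)
    ocreval_cmd "$gt" "ocr/$name.txt" "reports/$name"
done
```

Once the printed commands look right, they can be piped to `sh` to run them. File names containing spaces would need extra quoting, and this page does not say whether the output directory has to exist beforehand.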