Productive and safe Rust bindings/wrappers for Tesseract and Leptonica.
Make sure you have clang, Leptonica and Tesseract installed.
For Ubuntu users:
sudo apt-get install libleptonica-dev libtesseract-dev clang
You will also need to install Tesseract language data for the languages you want to recognize:
sudo apt-get install tesseract-ocr-eng
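// Initialize Tesseract with the default data path (None) and English language data.
let mut lt = leptess::LepTess::new(None, "eng").unwrap();
// Load the image to recognize from a file path.
lt.set_image("path/to/page.bmp");
// Run OCR and print the recognized text.
println!("{}", lt.get_utf8_text().unwrap());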
For more examples, see the docs and the examples directory.
To run the demos in the examples directory, try:
cargo run --example low_level_ocr_full_page
To run the tests, you will need Tesseract 4.x to match the trained data in tests/tessdata/eng.traineddata. See the CircleCI config for how to replicate the setup.
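Before running the tests, it can help to confirm that the trained data file is where the tests expect it. The following is a minimal sketch using only the Rust standard library. It assumes the `<lang>.traineddata` naming seen in tests/tessdata/eng.traineddata. The helpers `traineddata_path` and `has_language_data` are hypothetical names for illustration and are not part of leptess.

```rust
use std::path::{Path, PathBuf};

/// Build the path where the language data for `lang` is expected,
/// assuming the `<lang>.traineddata` naming convention.
/// Hypothetical helper, not part of leptess.
fn traineddata_path(tessdata_dir: &Path, lang: &str) -> PathBuf {
    tessdata_dir.join(format!("{}.traineddata", lang))
}

/// Return `true` if the language data file exists as a regular file.
/// Hypothetical helper, not part of leptess.
fn has_language_data(tessdata_dir: &Path, lang: &str) -> bool {
    traineddata_path(tessdata_dir, lang).is_file()
}

fn main() {
    // Directory taken from the path mentioned above; adjust for your checkout.
    let dir = Path::new("tests/tessdata");
    let path = traineddata_path(dir, "eng");
    if has_language_data(dir, "eng") {
        println!("found {}", path.display());
    } else {
        println!("missing {}", path.display());
    }
}
```

Run from the repository root, this reports whether the English data file is present. It only checks for the file and does not verify that the data matches the installed Tesseract version.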