Script to scan a document, clean it up, apply OCR, and output a PDF. Works on Ubuntu 16.04 (everything except recoll is installed from the default apt repos).
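
A minimal sketch of the scan, clean, OCR, and PDF steps, assuming scanimage (from sane-utils), unpaper, and tesseract are installed. The function name scan_to_pdf, its output-basename argument, the 300 dpi resolution, and the choice of tesseract for the OCR step are illustrative assumptions, not taken from the actual script, which may use different tools and options.

```shell
# Sketch only: scan one page, clean it, OCR it, and write a searchable PDF.
# Assumes scanimage, unpaper, and tesseract are on PATH (an assumption; the
# original script may rely on other tools).
scan_to_pdf() {
    out="${1:-scan}"                     # hypothetical output basename
    tmp="$(mktemp -d)" || return 1       # scratch directory for intermediates

    # 1. Scan to PNM. --resolution is a backend option and may vary by device.
    scanimage --format=pnm --resolution 300 > "$tmp/raw.pnm" || return 1

    # 2. Clean up the page (deskew, border and noise removal) with unpaper defaults.
    unpaper "$tmp/raw.pnm" "$tmp/clean.pnm" || return 1

    # 3. OCR the cleaned page and emit a PDF with a text layer as "$out.pdf".
    tesseract "$tmp/clean.pnm" "$out" -l eng pdf || return 1

    # Remove intermediates once the PDF has been written.
    rm -rf "$tmp"
}
```

Calling `scan_to_pdf invoice` would, under these assumptions, leave `invoice.pdf` in the current directory. A function like this could be invoked from a scanner-button handler such as buttonpressed.sh.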
-
buttonpressed.sh originally from here: https://github.com/leogaggl/misc-scripts/blob/master/buttonpressed.sh
-
fix-hocr.xml from here (fixes HTML embed placement in the final PDF): https://bugs.launchpad.net/cuneiform-linux/+bug/623438/comments/60
-
also helpful: