Turning on legacy OCR engine mode #39

dmypstl · 2019-02-09T17:08:19Z

Whitelist and blacklist are not implemented in version 4.0 (issue), and user patterns do not work (issue). In this issue, people recommend turning on the legacy OEM with the --oem 0 option flag. This option is not part of the configs; rather, it belongs to the engine itself, like the language.

Could we please enable more OCR options as arguments in tesseract::tesseract(), including oem, so that users can temporarily switch to the older version of the engine?

jeroen · 2019-02-09T17:22:34Z

The main problem is that v3 uses a different format for training data than v4, so we would need to manage different sets of training data within a single R package.

dmypstl · 2019-02-09T17:26:42Z

I see. This is pretty unfortunate. Maybe you could tag the latest stable version before the engine update and invite people to optionally install it with remotes::install_github("ropensci/tesseract@v3.0.4"), or whatever the last version was before it was updated to 4.0. I think I found the relevant commit now, but it is pretty awkward to refer to it.

jeroen · 2019-02-10T11:44:40Z

The easiest way to install an old version of the R package is using MRAN snapshots:

install.packages('tesseract', repos = 'https://cran.microsoft.com/snapshot/2018-09-01/')

jeroen · 2019-07-25T20:54:24Z

The whitelist / blacklist options are now supported in tesseract 4.1 (on CRAN).
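
For anyone landing here later, a minimal sketch of how a whitelist might be passed through the R package, assuming a build against Tesseract 4.1 or newer and that the English training data is installed. The file name "receipt.png" is a hypothetical placeholder, not something from this thread.

```r
library(tesseract)

# Create an engine restricted to digits via the whitelist parameter.
# Assumes the underlying Tesseract library is 4.1+, where this option works again.
digits <- tesseract(
  language = "eng",
  options = list(tessedit_char_whitelist = "0123456789")
)

# "receipt.png" is a hypothetical input image; substitute your own file.
text <- ocr("receipt.png", engine = digits)
cat(text)
```

The same options list should accept tessedit_char_blacklist in the same way; tesseract_params() lists the parameters the installed engine knows about.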

billdenney mentioned this issue Apr 8, 2019

Feature Request: Get all characters with confidence >x #41

Open

jeroen closed this as completed Jul 25, 2019
