Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

run_processor / get_processor: do not pass (empty) ocrd_tool #1009

Merged
merged 3 commits into from
Mar 22, 2023
Merged

Conversation

bertsky
Copy link
Collaborator

@bertsky bertsky commented Mar 15, 2023

Follow-up to premature #999: there are more places where we pass an empty ocrd_tool around that will never be used.

Reasoning (from ocrd_tesserocr discussion):

That's despite ocrd==2.47.1. Where does the empty ocrd_tool come from???

Aha! It turns out that my #999 was premature: we also pass an unnecessary ocrd_tool in other places:

Now, the big story here is: none of these places is needed (or used) – neither currently, nor for #974 nor #884!

It's also clear now, why we never used this: you need runtime information (from the CLI) for this. Either we are already on the CLI (Processor CLI decorator in all processor modules) or we have to make foreign CLI calls and parse the JSON response from the concrete Processor subclasses (which don't live in our venv, or we don't know the class name of, or are bashlib anyway).

@bertsky
Copy link
Collaborator Author

bertsky commented Mar 15, 2023

Now also updated the tests to not pass on ocrd_tool for no reason.

BTW, we should try to cover cached processor instances in the processor tests, too.

@bertsky
Copy link
Collaborator Author

bertsky commented Mar 17, 2023

@kba as this modifies existing API args, will necessitate a new minor – 2.48. I have scanned all ocrd_all code and have found no other use cases throwing ocrd_tool kwarg at run_processor other than in core's own regression tests. It may be that outside repos do this, though.

@bertsky
Copy link
Collaborator Author

bertsky commented Mar 18, 2023

@kba please prioritise. If you find any arguments against this, I can still try to find a different solution in ocrd_tesserocr, but as it stands, I don't see a reason. And since after #972, ocrd_tesserocr stopped working, this is blocking the next ocrd_all release.

@joschrew
Copy link
Contributor

joschrew commented Mar 21, 2023

My thoughts behind my approval: I see this PR as kind of reminder to not provide ocrd_tool=None in
a processor call (or in a run_processor invocation). This must be kept in mind / ensured for the
PR 972 and for other possible changes.

@kba kba merged commit f47ad25 into master Mar 22, 2023
@bertsky
Copy link
Collaborator Author

bertsky commented Mar 22, 2023

@kba for the v2.48 changelog:

Changed

@kba kba deleted the followup-999 branch January 24, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants