update 2023-03-16 #356

Merged
merged 8 commits into from
Mar 24, 2023

Conversation

kba
Member

@kba kba commented Mar 16, 2023

OK, now with a cleaned-up commit history; discussion is in #355.

@kba
Member Author

kba commented Mar 16, 2023

CI fails, but that might be because the test is built on the ocrd/core-cuda image, which is now very outdated.

I'm building the images locally now and, if the build succeeds, I will push the non-cuda images manually. Not ideal, but since I will be AFK for a few days and this is a substantial update, I want to update what I can today.

@bertsky
Collaborator

bertsky commented Mar 16, 2023

[update ocrd_keraslm](/OCR-D/ocrd_all/pull/356/commits/682324b9e7c2d8ff612b4a2ac63268ca17a914a5)

oops, that was not enough:

tensorflow-gpu 1.15.5 has requirement numpy<1.19.0,>=1.16.0, but you have numpy 1.19.5

@bertsky
Collaborator

bertsky commented Mar 16, 2023

> [update ocrd_keraslm](/OCR-D/ocrd_all/pull/356/commits/682324b9e7c2d8ff612b4a2ac63268ca17a914a5)
>
> oops, that was not enough:
>
> tensorflow-gpu 1.15.5 has requirement numpy<1.19.0,>=1.16.0, but you have numpy 1.19.5

Damn. It's even worse. The above requirement comes from PyPI TF 1.15 (only Py36 and Py37). But on Nvidia TF 1.15 (only Py37 and Py38) we have:

   tensorflow-gpu 1.15.5+nv23.2 depends on numpy<1.24 and >=1.22.0; python_version >= "3.7"

But >=1.22 itself was incompatible with the older h5py we need here...

@kba
Copy link
Member Author

kba commented Mar 16, 2023

OK, so the build is still running for medium and maximum; if they finish successfully, I'll push them tomorrow morning. minimum is updated.

@kba
Member Author

kba commented Mar 16, 2023

and of course they started failing as soon as I wrote this :/

@bertsky
Collaborator

bertsky commented Mar 17, 2023

> But >=1.22 itself was incompatible with the older h5py we need here...

That turned out not to be true. There's a window of <1.24 >=1.22 for tensorflow-gpu==1.15.5+nv23.2, and h5py seems to be fine with it (confirmed by processing). There was another glitch with Protobuf, but now it seems to work for Py37 and Py38. Please update to master (OCR-D/ocrd_keraslm@9e3f5a0).

@bertsky
Collaborator

bertsky commented Mar 17, 2023

> There's a window of <1.24 >=1.22 for tensorflow-gpu==1.15.5+nv23.2. But h5py seems to be fine (confirmed by processing).

The problem is that here in ocrd_all, numpy does get updated to 1.24 by other modules, which have no restriction on its version and therefore indiscriminately pull the newest version. For example via shapely:

Collecting numpy>=1.14
...
tensorflow-gpu 1.15.5+nv23.2 requires numpy<1.24,>=1.22.0; python_version >= "3.7", but you have numpy 1.24.2 which is incompatible.

So again, this is pip playing dumb. Not sure how to fix this, except with our typical after-the-fact workaround:

. $(ACTIVATE_VENV) && $(SEMPIP) pip install imageio==2.14.1 "tifffile<2022"
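
For illustration, the version window from the pip message above can also be checked programmatically, for example as a sanity check once all modules are installed. This is only a sketch: `parse_release` and `in_window` are hypothetical helper names (not part of ocrd_all or pip), and the parsing only handles plain numeric releases such as 1.24.2; a real check would rather rely on a proper version-parsing library.

```python
from typing import Tuple


def parse_release(version: str, width: int = 3) -> Tuple[int, ...]:
    """Turn '1.24.2' into (1, 24, 2), padding short versions like '1.24' with zeros."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def in_window(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (numeric releases only)."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Window required by tensorflow-gpu 1.15.5+nv23.2 according to the pip message.
print(in_window("1.24.2", "1.22.0", "1.24"))  # version pulled in via shapely
print(in_window("1.23.5", "1.22.0", "1.24"))  # an example version inside the window
```

Fed with the installed version (for example from `importlib.metadata.version("numpy")` on Python 3.8+), such a check could decide whether the workaround rule has to run at all.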

@bertsky
Collaborator

bertsky commented Mar 17, 2023

Not sure if this is new:

ocrd-dinglehopper -h
Traceback (most recent call last):
  File "/usr/local/bin/ocrd-dinglehopper", line 5, in <module>
    from qurator.dinglehopper.ocrd_cli import ocrd_dinglehopper
ModuleNotFoundError: No module named 'qurator.dinglehopper'

Dinglehopper was installed with pip install -e ...

@bertsky
Collaborator

bertsky commented Mar 17, 2023

> Not sure if this is new:
>
> ocrd-dinglehopper -h
> Traceback (most recent call last):
>   File "/usr/local/bin/ocrd-dinglehopper", line 5, in <module>
>     from qurator.dinglehopper.ocrd_cli import ocrd_dinglehopper
> ModuleNotFoundError: No module named 'qurator.dinglehopper'

@mikegerber This is caused by qurator-spk/dinglehopper#76.

To get it working again, I have to revert both qurator-spk/dinglehopper@b4ac24a and qurator-spk/dinglehopper@833efa3.

Note that the documentation explicitly warns that every package in the namespace must follow the (same) convention.

(BTW, eynollah still uses the pkg_resources style namespace pkg.)
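
The effect of mixing conventions can be reproduced in isolation. The sketch below is a simplified stand-in, not the actual qurator setup: it uses a hypothetical namespace `demo_ns` with two throwaway distributions, and a plain empty `__init__.py` in place of the pkg_resources-style declaration. As soon as one distribution ships an `__init__.py` at the namespace level while the other does not, the import system treats the namespace as a regular package and no longer sees the second subpackage.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_dist(root: Path, namespace: str, subpackage: str, with_init: bool) -> None:
    """Create root/namespace/subpackage/, optionally with a namespace-level __init__.py."""
    pkg_dir = root / namespace / subpackage
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")
    if with_init:
        # An __init__.py at the namespace level makes it a regular package.
        (root / namespace / "__init__.py").write_text("")


def can_import_both(first_has_init: bool) -> bool:
    """Return True if both subpackages of the shared namespace are importable."""
    namespace = "demo_ns"  # hypothetical name, chosen to avoid real packages
    with tempfile.TemporaryDirectory() as tmp:
        dist_a, dist_b = Path(tmp, "a"), Path(tmp, "b")
        make_dist(dist_a, namespace, "alpha", with_init=first_has_init)
        make_dist(dist_b, namespace, "beta", with_init=False)
        sys.path[:0] = [str(dist_a), str(dist_b)]
        importlib.invalidate_caches()
        try:
            importlib.import_module(f"{namespace}.alpha")
            importlib.import_module(f"{namespace}.beta")
            return True
        except ModuleNotFoundError:
            return False
        finally:
            # Undo the path change and forget the demo modules again.
            del sys.path[:2]
            for name in [m for m in sys.modules if m.split(".")[0] == namespace]:
                del sys.modules[name]


print(can_import_both(first_has_init=False))  # both distributions use the same style
print(can_import_both(first_has_init=True))   # styles are mixed
```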

@bertsky
Collaborator

bertsky commented Mar 18, 2023

Note: since we are now on Py38, we also require

Also, numpy has to be held at <1.24 not only for h5py and TF, but also for various usages in our own code (ocrd_cis, ocrd_anybaseocr, ocrd_segment). Adapting to the 1.21 type deprecations will be a lot of work, and we must get on with it soon. I wonder where the best place to enforce that pin would be (core requirements, a make or docker rule, or ocrd_all)?
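
To get a first idea of how much code is affected, one could scan the modules for the removed names. The sketch below assumes the deprecations in question are the aliases of builtin types (`np.float`, `np.int`, `np.bool` and friends), which NumPy deprecated in 1.20 and removed in 1.24. `find_removed_aliases` is a hypothetical helper, and the regex is a rough heuristic that will also match inside comments and strings.

```python
import re
from typing import List, Tuple

# Aliases of builtin types that were removed in NumPy 1.24.
# The trailing \b keeps valid names like np.int32, np.float64 or np.bool_ out.
ALIAS_PATTERN = re.compile(r"\b(?:np|numpy)\.(?:bool|int|float|complex|object|str)\b")


def find_removed_aliases(source: str) -> List[Tuple[int, str]]:
    """Return (line number, matched alias) for every suspicious usage in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in ALIAS_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


sample = (
    "x = np.zeros(3, dtype=np.float)\n"
    "y = np.int32(1)\n"
    "z = arr.astype(np.bool)\n"
)
for lineno, alias in find_removed_aliases(sample):
    print(f"line {lineno}: {alias}")
```

Each hit would then be replaced by the builtin (`float`, `int`, `bool`) or by an explicit NumPy scalar type such as `np.float64`, depending on what the code actually needs.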

As for models trained with ocrd_segment, cor-asv-ann and ocrd_keraslm (all HDF5): none of these work anymore; everything must be converted to the SavedModel format first (which I have not even begun working on).

It would really help if we had something for #112. I don't think Quiver achieves much coverage among processors and models currently...

Overall, I must say the current release sprint has been the worst of the worst. The move to Python 3.8 / Ubuntu 20 was premature IMO. It should have been tested independently, as I warned two years ago. By forcing this, we mixed all problems into one big maze:

  • fixes and updates in core we have been waiting for
  • a working CUDA version of core (after more than 1yr)
  • Python 3.8 requires newer CUDA base image (CUDA toolkit and driver, libcudnn)
  • Python 3.8 pickler changes, which necessitate migrating HDF5 models to the SavedModel format
  • Python 3.8 importlib changes
  • more recently: Shapely/Numpy/Protobuf changes
  • core's get_processor with additional workspace arg
  • additional glitches all over the place

@mikegerber
Contributor

> Not sure if this is new:
>
> ocrd-dinglehopper -h
> Traceback (most recent call last):
>   File "/usr/local/bin/ocrd-dinglehopper", line 5, in <module>
>     from qurator.dinglehopper.ocrd_cli import ocrd_dinglehopper
> ModuleNotFoundError: No module named 'qurator.dinglehopper'
>
> @mikegerber This is caused by qurator-spk/dinglehopper#76.
>
> To get it working again, I have to revert both qurator-spk/dinglehopper@b4ac24a and qurator-spk/dinglehopper@833efa3.
>
> Note that the documentation explicitly warns that every package in the namespace must follow the (same) convention.
>
> (BTW, eynollah still uses the pkg_resources style namespace pkg.)

This is why I called this CAJ (computer-aided Jenga)...

@kba kba merged commit e1eb8b6 into master Mar 24, 2023
@stweil stweil deleted the update-2023-03-16-again branch June 30, 2023 17:31