Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loosen v1 microservice dependencies #4911

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

mwm5945
Copy link
Contributor

@mwm5945 mwm5945 commented Jun 12, 2023

What this PR does / why we need it:
Resolves #4899. Loosens requirements to allow for ML frameworks to define them.

Which issue(s) this PR fixes:

Fixes #4899

Special notes for your reviewer:

@seldondev
Copy link
Collaborator

Hi @mwm5945. Thanks for your PR.

I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

@RafalSkolasinski
Copy link
Contributor

/test integration

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 12, 2023

Will re-open in a bit--some "enterprise" stuff going on over here :)

@mwm5945 mwm5945 reopened this Jun 13, 2023
@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 13, 2023

@RafalSkolasinski should be gtg now! thanks!

@ukclivecox
Copy link
Contributor

/test integration

@ukclivecox
Copy link
Contributor

/test notebooks

@ukclivecox ukclivecox added the v1 label Jun 17, 2023
@ukclivecox
Copy link
Contributor

There seem to be some python lint errors:

keyring 23.13.1 requires importlib-metadata>=4.11.4; python_version < "3.12", but you have importlib-metadata 4.2.0 which is incompatible.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
black \
	--check ./ ../testing/scripts \
	--exclude "(proto|seldon_core/proto/|.eggs|.tox)"
would reformat setup.py
Oh no! 💥 💔 💥
1 file would be reformatted, 72 files would be left unchanged.
make: *** [Makefile:113: lint] Error 1

python/setup.py Outdated Show resolved Hide resolved
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 20, 2023

@igor-shai @cliveseldon should be fixed!

Copy link
Contributor

@adriangonz adriangonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good! Once tests pass, this should be good to go @mwm5945 🚀

@adriangonz
Copy link
Contributor

/test integration

@adriangonz
Copy link
Contributor

/test notebooks

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 21, 2023

@adriangonz I don't have access to see why the above tests failed, any ideas?

@adriangonz
Copy link
Contributor

It seems like flaky integration errors. I'll re-trigger the tests @mwm5945 .

@adriangonz
Copy link
Contributor

/retest

@adriangonz
Copy link
Contributor

/test integration

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 27, 2023

@adriangonz anything i can do on my end?

@adriangonz
Copy link
Contributor

It's unclear TBH - integration tests are not showing the logs with the failure for some reason. Most likely, it's something strange with our CI. I'll run them locally and get back to you.

In the meantime, could you rebase from master? There's a recent PR which removes flatbuffer completely from the seldon_core Python package.

@adriangonz
Copy link
Contributor

Just had a quick look locally @mwm5945, and the issue seems to be that protobuf is getting bumped to 4.x+, which causes the following error message:

Traceback (most recent call last):
  File "./conda_env_create.py", line 13, in <module>
    from seldon_core.microservice import PARAMETERS_ENV_NAME, parse_parameters
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/microservice.py", line 15, in <module>
    from seldon_core import wrapper as seldon_microservice
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/wrapper.py", line 10, in <module>
    import seldon_core.seldon_methods
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/seldon_methods.py", line 21, in <module>
    from seldon_core.proto import prediction_pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/prediction_pb2.py", line 19, in <module>
    from seldon_core.proto.tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/tensor_pb2.py", line 16, in <module>
    from . import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/resource_handle_pb2.py", line 16, in <module>
    from . import tensor_shape_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__shape__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/tensor_shape_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/opt/conda/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 561, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

To workaround it, it's best to constrain protobuf to >=3.20.2,<4.0.0 . @mwm5945 once you rebase from master and add that constrain, we can try to re-trigger the tests.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 28, 2023

hmm...that should be covered here, no?

extras = {"tensorflow": ["tensorflow", "protobuf>=3.20.2,<4.0.0"]}

@adriangonz
Copy link
Contributor

The tensorflow extra is only used on certain cases though. If not used, it will just use the default protobuf requirement from the main set of deps (which only says protobuf <= 5.0.0).

The wider problem with that is that the generated protobuf code is linked to a major version of protobuf. That is, protobuf classes generated with protobuf==4.x won't work with protobuf==3.x and viceversa. Therefore, the same package wouldn't work for both pip install seldon_core and pip install seldon_core[tensorflow].

So what I would do is move that protobuf>=3.20.2,<4.0.0 line out of the tensorflow extra and into the main set of dependencies.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 28, 2023

problem there is that it creates conflicts with other sets of dependencies, such as RapisAI/cuML

@adriangonz
Copy link
Contributor

adriangonz commented Jun 28, 2023

Right, I see... one potential workaround is to ensure the "official" Docker images do use protobuf==3.20.3, while keeping the seldon_core package itself unconstrained. This can be as simple as a pip install protobuf==3.20.3 on the Dockerfile or having an actual lockfile.

Having said that though, it's unclear how that would work when seldon_core gets installed with protobuf==4.x on your own environment? As far as I can see you'd get the same error as the one shown in #4911 (comment)

@adriangonz
Copy link
Contributor

BTW: Do you know how cuml introduces the incompatibility with protobuf==3.20.3? I was having look at their reqs but I can't see protobuf listed there.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 29, 2023

cuml/libcuml requires cudf, requires protobuf 4.x

@adriangonz
Copy link
Contributor

Mmmh that's annoying - in that case, what do you think about the workaround suggested in #4911 (comment)?

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 29, 2023

very haha. Im actually wondering why TF doesn't declare a version of protobuf in their requirements list....seems like a hard requirement that should be defined on their side--which would solve the issue here because Pip should resolve the version on our behalf....

the workaround is certainly less than elegant.... im by no means a python expert, is there something we can do like how importlib is installed here?

@adriangonz
Copy link
Contributor

As far as I'm aware, the main problem is that the protobuf stubs generated (which are statically generated and ship with the Python package - under the seldon_core.proto module) are dependent on the major version of protobuf. That is, stubs generated with protobuf==3.x will not work with protobuf==4.x and viceversa.

Having said that, I just noticed that tensorflow does list a requirement of protobuf>=3.20.3,<5.0.0dev. Therefore, perhaps there's a way to make the statically generated stubs compatible across major version of protobuf?

We'll have a deeper look at the above, and will post an update once we confirm if possible.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jun 30, 2023

great, thanks! from my perspective it doesn't look like the issue comes from seldon's protos, rather the dependency on TF protos

@adriangonz
Copy link
Contributor

Hey @mwm5945,

We've had a deeper look at this one, and it does seem like gRPC stubs generated with protobuf==4.x continue working even if you downgrade later to protobuf==3.20.x. Therefore, the remaining bits for this PR (to make it work with protobuf==4.x) would be:

  1. Bump these dev deps in the python/requirements-dev.txt file (required to build the gRPC stubs with protobuf==4.x): grpcio-tools==1.56.0, mypy-protobuf==3.4.0.
  2. Re-generate the gRPC stubs (with protobuf==4.x installed locally!). This can be done with the make -C python build_apis target.

Once that's done, we can then re-trigger the integration tests to confirm that everything works as expected.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 5, 2023

working on this--bit of a delay dealing with policies/corporate proxies :)

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 10, 2023

@adriangonz should be pushed now :)

@adriangonz
Copy link
Contributor

/test integration

@adriangonz
Copy link
Contributor

Thanks for making those changes @mwm5945 .

The integration tests seem to be failing now for an unrelated issue. Once we get that fixed in master, we'll retrigger the tests.

@adriangonz
Copy link
Contributor

Hey @mwm5945 , fix has now been merged in master (#5015). Could you rebase to make sure it's present on this branch?

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 11, 2023

@adriangonz updated!

@adriangonz
Copy link
Contributor

/test integration

@adriangonz
Copy link
Contributor

Mmmh it seems the protobuf incompatibility is still there for some reason. We'll try to replicate locally to get more info.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 12, 2023

sure--let me know if there's anything i can help with!

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 25, 2023

@adriangonz just wanted to check in and see if there's anything I can do to help!

@adriangonz
Copy link
Contributor

Hey @mwm5945 ,

After having a deeper look into the integration tests failures, the issue seems to be with the dependencies installed to run the integration tests (and not the image / python package ones). This is the file under ./testing/scripts/dev_requirements.txt, which is quite outdated.

I've bumped those into a separate PR (#5058), so once that goes in we should be able to rebase this one and that should (hopefully) fix the integration tests.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 26, 2023

great--i'll keep an eye out!! thanks!

@adriangonz
Copy link
Contributor

Hey @mwm5945 ,

Just as a heads up, we've just merged #5058, which should have the fix to get the integration tests running on this PR.

When you get a chance, if you rebase from master we should then be able to kick off again the integration tests.

@mwm5945
Copy link
Contributor Author

mwm5945 commented Jul 27, 2023

@adriangonz done!

@adriangonz
Copy link
Contributor

/test integration

1 similar comment
@adriangonz
Copy link
Contributor

/test integration

@seldondev
Copy link
Collaborator

@mwm5945: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
notebooks 9845645 link /test notebooks
integration 1ceccc5 link /test integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. I understand the commands that are listed here.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

V1 Microservice has restrictive/conflicting dependencies with common ML frameworks
7 participants