support post-release versions, publish v1.15.0.post1 #5
Conversation
Once this merges, @pentschev or @jameslamb could you test installing a ucxx wheel along with this specific version of libucx (the ucxx wheels will allow using 1.15.0 at runtime right now I believe) to ensure that things work the way we want on both CPU-only and GPU-enabled machines?
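For reference, here is a minimal sketch of the kind of check such a test could run inside a container (hypothetical; not the actual CI script). It assumes the `libucx` module exposes `load_library()` as described in the PR description below, and that a `ucxx-cu12` wheel is installed alongside it.

```python
# Hypothetical smoke test: verify the wheel's loader behaves sensibly
# both on GPU machines and on CPU-only machines.
import libucx

try:
    # pre-loads libcuda.so / libnvidia-ml.so when a CUDA driver is present
    libucx.load_library()
    print("libucx.load_library() succeeded")
except Exception as exc:
    # on the plain 1.15.0 wheel a CPU-only machine lands here;
    # the .post1 release is meant to avoid raising in this situation
    print(f"libucx.load_library() raised: {exc}")

import ucxx  # downstream consumer; assumes the ucxx-cu12 wheel is installed

print("ucxx imported OK")
```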
Yes, I can do this.
CI is stuck waiting for arm64 runners. Once those run and (hopefully) pass, I'll merge this and test with the ucxx wheels as described above.
Thanks @jameslamb !
@vyasr @pentschev the CI pipeline from this PR has been stuck waiting for a runner for 2+ hours (build link), so the new wheels aren't up on the nightly index yet (https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/libucx-cu12/). To get around that, I did a minimal test with a local `test.sh` script. Ran it with and without a GPU visible to the processes.

```shell
# with GPU
docker run \
    --rm \
    --gpus 1 \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/citestwheel:cuda12.2.2-ubuntu22.04-py3.10 \
    bash ./test.sh

# no GPU
docker run \
    --rm \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/citestwheel:cuda12.2.2-ubuntu22.04-py3.10 \
    bash ./test.sh
```

Saw it succeed (after applying the modifications from rapidsai/ucxx#229) on both. I think that's enough evidence to move forward with publishing the other versions (1.14.0.post1 and 1.16.0.post1). But to be sure, tomorrow I'll try with the 1.15.0.post1 wheels in CI for rapidsai/ucx-py#1041.
Running the example at `python/examples/basic.py` results in the following:

```text
[1715298582.923398] [f059c9da14bb:1 :0]     parser.c:2033  UCX  WARN  unused environment variable: UCX_MEMTYPE_CACHE (maybe: UCX_MEMTYPE_CACHE?)
[1715298582.923398] [f059c9da14bb:1 :0]     parser.c:2033  UCX  WARN  (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)
Traceback (most recent call last):
  File "/opt/work/./python/examples/basic.py", line 259, in <module>
    main()
  File "/opt/work/./python/examples/basic.py", line 227, in main
    listener_ep.tag_send(Array(send_bufs[0]), tag=ucx_api.UCXTag(0)),
AttributeError: module 'ucxx._lib.libucxx' has no attribute 'UCXTag'. Did you mean: 'UCXXTag'?
```

It looks to me like that suggestion is right, and that the class is called `UCXXTag`: https://github.com/rapidsai/ucxx/blob/2195ceabf35b404b3ae6b09f784d1d312b4b6fce/python/ucxx/_lib/tag.pyx#L8

This PR proposes the following:

* fixing that reference
* adding a print statement at the end, so that you know the example reached the end successfully without having to inspect the exit code of the process

## Notes for Reviewers

I found this because I was using this example to smoke test changes to the new ucx wheels: rapidsai/ucx-wheels#5

### How I tested this

```shell
docker run \
    --rm \
    --gpus 1 \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/citestwheel:cuda12.2.2-ubuntu22.04-py3.10 \
    pip install 'ucxx-cu12==0.38.*,>=0.0.0a0' && python ./python/examples/basic.py
```

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #229
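As a quick sanity check of the renamed class, something like the following (hypothetical, assuming a `ucxx` wheel is installed) confirms the module exposes `UCXXTag` rather than `UCXTag`:

```python
# Hypothetical check of the attribute names involved in the AttributeError above.
import ucxx._lib.libucxx as ucx_api

assert hasattr(ucx_api, "UCXXTag"), "expected the tag class to be exported as UCXXTag"
assert not hasattr(ucx_api, "UCXTag"), "UCXTag should not exist (it was the bad reference)"
print("tag class is ucx_api.UCXXTag")
```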
These wheels are working with the downstream testing in rapidsai/ucx-py#1041.
Awesome!
Follow-up to #5. Proposes publishing a `1.14.1.post1` version, identical to version `1.14.1` except that `load_library()` will no longer raise exceptions in non-GPU environments.

## Notes for Reviewers

Just putting this up to get a CI run in. Should probably wait to merge it until testing on rapidsai/ucx-py#1041 is done.
Follow-up to #5. Similar to #6. Proposes publishing a `1.16.0.post1` version, identical to version `1.16.0` except that `load_library()` will no longer raise exceptions in non-GPU environments.

## Notes for Reviewers

Checked https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/libucx-cu12/ to confirm that it's exactly `1.16.0` that we want to do a post-release for.
Proposes adding some minimal `pre-commit` checks and a CI job to run them. I think this is a cheap, low-risk way to get a bit more release confidence in changes like #5.

## Notes for Reviewers

This includes a very pared-down version of https://github.com/rapidsai/shared-workflows/blob/branch-24.06/.github/workflows/checks.yaml. I'm proposing not depending on `shared-workflows` because this repo doesn't follow the RAPIDS branching model, and because right now the need is so simple (just running `pre-commit run --all-files`).

Co-authored-by: Kyle Edwards <kyedwards@nvidia.com>
Contributes to rapidsai/build-planning#57.
`libucx.load_library()` (defined here) tries to pre-load `libcuda.so` and `libnvidia-ml.so`, to raise an informative error (instead of a cryptic one from a linker) if someone attempts to use the libraries from this wheel on a system without a GPU.

Some of the projects using these wheels, like `ucxx` and `ucx-py`, are expected to be usable on systems without a GPU. See rapidsai/ucx-py#1041 (comment).

To avoid those libraries needing to try-catch these errors, this proposes the following (a rough sketch of the pre-load pattern follows this list):

* `load_library()` no longer raising exceptions in non-GPU environments
* publishing a `v1.15.0.post1` post-release with that change
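For context, here is a rough sketch (mine, not the actual `libucx` code; the function body, the `strict` flag, and the messages are illustrative) of the pre-load pattern described above and of the behaviour change the post-releases make:

```python
# Illustrative sketch only -- NOT the actual libucx.load_library() implementation.
# Pattern: dlopen the CUDA driver libraries up front so users get a clear error
# instead of a cryptic linker failure deep inside UCX.
import ctypes


def load_library(strict: bool = False) -> None:
    # 'strict' is purely illustrative: True mimics the pre-.post1 behaviour
    # (raise on CPU-only machines), False mimics the .post1 behaviour.
    for name in ("libcuda.so", "libnvidia-ml.so"):
        try:
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError as err:
            if strict:
                raise RuntimeError(
                    f"{name} could not be loaded; is a CUDA driver installed?"
                ) from err
            print(f"{name} not available; continuing without it")


if __name__ == "__main__":
    load_library()
```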
## Notes for Reviewers

Proposing starting with `1.15.0.post1` right away, since that's the version that `ucx-py` will use. I'm proposing the following sequence of PRs here (assuming downstream testing goes well):

1. `1.15.0.post1` (this PR)
2. `1.14.0.post1`
3. `1.16.0.post1`
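Since the whole plan leans on post-release versions, a small demo of how PEP 440 orders and matches them may help reviewers; the constraint shapes below are illustrative, not the exact pins `ucxx` or `ucx-py` use. It can be run with the `packaging` library:

```python
# How PEP 440 post-releases compare and match (illustrative constraints).
from packaging.specifiers import SpecifierSet
from packaging.version import Version

post = Version("1.15.0.post1")

print(post > Version("1.15.0"))                # True: a post-release sorts after its base version
print(post in SpecifierSet(">=1.15.0,<1.16"))  # True: range constraints pick up the post-release
print(post in SpecifierSet("==1.15.0"))        # False: strict '==' does not match a post-release
print(post in SpecifierSet("==1.15.0.*"))      # True: prefix matching does
```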