[REVIEW] Update CTK to CUDA 11.0 #9
Still need to test and verify all libs are included
There are `10`, `11`, and `110` versions in the 11.0 installer.
Instead of being `11`, it is `10` from the installer.
Unable to use pre-link scripts, as they do not show the message and instead show a warning about pre-link scripts being removed.
@mike-wendt Can you rebase now that #6 has been merged? Also adding
Address merge-conflicts and update changes to work for CUDA 11

* upstream-master:
  Add override flag
  fix nonembedded extract
  fix for embedded image
  add ppc64le support

# Conflicts:
#	build.py
I'm getting errors similar to
The recent merge seems to have broken the process that was working. CUDA 11 appears to prefer `--toolkit` over the `--extract` option used for CUDA 10.2 and earlier.
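The version-dependent flag choice described above could be sketched roughly as follows. This is an illustrative helper, not the actual `build.py` implementation; the function name and the exact flag lists are assumptions based on the comment.

```python
def installer_args(cuda_version, toolkitpath):
    """Pick runfile-installer flags by CUDA version (hypothetical sketch).

    CUDA 11 runfile installers favor --toolkit/--toolkitpath, while the
    10.2-and-earlier installers support --extract=<dir>.
    """
    major = int(cuda_version.split(".")[0])
    if major >= 11:
        return ["--silent", "--toolkit", "--toolkitpath={}".format(toolkitpath)]
    return ["--extract={}".format(toolkitpath)]
```

A build script would append these to the `sh cuda_*.run` invocation; the point is only that the branch happens on the CUDA major version.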
I think fixing the merge in my latest commit fixes this, as I reverted to using

@jjhelmus @jakirkham do you have any suggestions? Right now for testing I've been removing it, but I know it is needed in the end package that is released.
Attempting to use the changes from the ppc64le merge before changing them.
Will probably need to add a ppc64le override to 'lib64'.
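The override mentioned above might look something like this. The platform strings and the claim that ppc64le needs `lib64` are taken from the comment; the function itself is a hypothetical sketch, not code from this repo.

```python
# Hypothetical sketch: per-platform library directory override,
# assuming the ppc64le toolkit lays out libraries under lib64.
_LIBDIR_OVERRIDES = {"linux-ppc64le": "lib64"}

def libdir(platform):
    return _LIBDIR_OVERRIDES.get(platform, "lib")
```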
Force-pushed from 880f8d3 to 894ab09
This does not work either and causes failures during solve. Need to remove it to unblock our CI.
These versions do not have any run constraints: CUDA 11 GA - https://anaconda.org/nvidia/cudatoolkit/files?version=11.0.194

Adding

Error message using

Error using

Both of these were from docker builds running on a CPU-only node, but building CUDA images. As I mentioned in a commit, we need something that works for CPU-only environments as well, given we build all of our conda packages on CPU-only nodes with CUDA images.
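For CPU-only build nodes, conda resolves the `__cuda` virtual package from the `CONDA_OVERRIDE_CUDA` environment variable before probing the driver (this override comes up again later in the thread). A toy model of that precedence, where `probe_driver` is a hypothetical stand-in for the actual driver query:

```python
import os

def detected_cuda(probe_driver, env=None):
    """Toy model: resolve the __cuda virtual package version.

    CONDA_OVERRIDE_CUDA takes precedence over probing the driver,
    which is what makes CUDA builds possible on CPU-only nodes.
    This is an illustration of the semantics, not conda's code.
    """
    env = os.environ if env is None else env
    override = env.get("CONDA_OVERRIDE_CUDA", "").strip()
    if override:
        return override
    return probe_driver()  # None on a machine with no NVIDIA driver
```

On a CPU-only node, exporting `CONDA_OVERRIDE_CUDA=11.0` before `conda build` makes the virtual package appear present even though no driver exists.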
Force-pushed from 0208f0c to 0f70af4
What does

As far as what? I've replaced the packages, so I don't have any to test without rebuilding them.

Am trying to understand more about the machine where this conflict is showing up.

It was running on an Ubuntu 18.04 AWS node doing a docker build. The error comes from inside the docker build on any image. This is just one example. Any of the failed CUDA 11 builds for this job have the same
@jjhelmus ready for review and input on the above constraints issues. Thanks!

@jjhelmus would it be possible for you to review this again in the near future? We're trying to push out the RAPIDS 0.15 release, and this update is needed to build CUDA 11-enabled conda packages in conda-forge for things like CuPy. Is there anything we can do on our end to help reduce the maintenance burden on you?

@kkraus14 @mike-wendt This looks good outside of the question of how to constrain the package. I've been able to replicate the build on our machines for linux-64 and am trying a build on our linux-ppc64le machine as well. I need to do some more testing around the
Was able to confirm this builds fine on

```diff
$ git diff
diff --git a/meta.yaml b/meta.yaml
index 1f5361f..6ad4d6c 100644
--- a/meta.yaml
+++ b/meta.yaml
@@ -32,8 +32,8 @@ requirements:
     - tqdm
     # for run_exports
     - {{ compiler('cxx') }}
-  #run:
-  #  - __glibc >=2.17  # [linux]
+  run_constrained:
+    - __cuda >=11.0
```

I have linux packages built with this change that I plan on uploading to
@jjhelmus this is not in master currently - how did the ppc64le 10.2 pkg get published with the constraint on the web, but the tarball and included
add run_constrained requirement on __cuda
Co-authored-by: jakirkham <jakirkham@gmail.com>
So I thought we were already hotfixing

The

We are, but ideally the packages would have the constraint included rather than patched in.

This is my point on why this change is necessary here when it is obviously being added elsewhere. Now I know where.
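The "patched in" alternative above refers to hotfixing repodata after publication rather than baking the constraint into the package. A minimal sketch of such a patch function, modeled loosely on conda-forge-style repodata patching (the record fields follow conda's repodata schema, but this exact function is illustrative):

```python
def patch_record(record):
    """Add the __cuda run constraint to a published cudatoolkit 11.0
    record if it is missing (hypothetical repodata hotfix sketch)."""
    if record.get("name") == "cudatoolkit" and record.get("version", "").startswith("11.0"):
        constrains = record.setdefault("constrains", [])
        if not any(c.split()[0] == "__cuda" for c in constrains):
            constrains.append("__cuda >=11.0")
    return record
```

The drawback the comment points at: a hotfix only fixes the index, while the tarball's own metadata stays unconstrained, so anything consuming the package outside the patched channel misses the constraint.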
Ok, that sounds fine. On the building point, could you please check whether this ( #9 (comment) ) works, Mike? |
@jakirkham I have to rebuild this package and then try to build images, which is an hour or more of work. Given we're in the middle of a release, I don't have that time to troubleshoot this at the moment. If you're both happy with this, then I would say merge and publish. I still have pkgs without the constraint so we won't be impacted, but my suspicion is this is a larger issue. From my view, an image that is
With conda 4.8.4 I'm able to build packages from either of these two recipes if CONDA_OVERRIDE_CUDA=11.0 is set in the shell prior to calling conda build:

constrained:

```yaml
package:
  name: test
  version: 1.0.0

requirements:
  run_constrained:
    - __cuda >=11.0

test:
  commands:
    - echo "Hi"
```

run:

```yaml
package:
  name: test
  version: 2.0.0

requirements:
  run:
    - __cuda >=11.0

test:
  commands:
    - echo "Hi"
```

test-1.0.0 (the constrained version) is not install-able with conda 4.8.4 on a system without the CUDA 11 driver, but it can be with earlier versions of conda. test-2.0.0 is not install-able without the CUDA 11 driver with both conda 4.8.4 and earlier versions.
No worries @mike-wendt. Just trying to make sure you have a path forward 🙂 |
Merging. linux-64 and linux-ppc64le packages should be available on
In addition, add the NVIDIA EULA and update the about section to reflect the contents of this package.
This follows initial work in #7 for the CUDA 11 RC.