Include ptxas in cudatoolkit package #15

Open
wants to merge 2 commits into base: master

@jayfurmanek jayfurmanek commented Mar 12, 2021

It was discovered that XLA does not work with the latest TensorFlow GPU builds:
AnacondaRecipes/tensorflow_recipes#24 (comment)

This is because XLA requires the ptxas compiler and libdevice.10.bc.
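A quick way to check whether an environment has both files is sketched below. It is only an illustration: the function name and the `prefix` argument are hypothetical, and the paths are the ones named in this PR (`nvvm/libdevice/` and `bin/` under the environment root), not something taken from XLA's own lookup code.

```python
import shutil
from pathlib import Path


def find_xla_cuda_deps(prefix):
    """Report where ptxas and libdevice.10.bc are found under a conda prefix.

    `prefix` is a hypothetical stand-in for the environment root
    ($CONDA_HOME in this PR). Returns a dict mapping each dependency
    name to its path, or to None when it is missing.
    """
    prefix = Path(prefix)
    # Location named in this PR as the one XLA expects.
    libdevice = prefix / "nvvm" / "libdevice" / "libdevice.10.bc"
    # Search only the environment's own bin directory, not the whole PATH.
    ptxas = shutil.which("ptxas", path=str(prefix / "bin"))
    return {
        "libdevice.10.bc": str(libdevice) if libdevice.is_file() else None,
        "ptxas": ptxas,
    }
```

Calling `find_xla_cuda_deps(sys.prefix)` from inside the environment should show which of the two is missing.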

This PR has two changes to the cudatoolkit package to fix XLA:

  • The libdevice.10.bc file is in the wrong location for XLA. It is shipped in $CONDA_HOME/lib but should be in $CONDA_HOME/nvvm/libdevice/, so this PR copies it to that location (and keeps it in /lib too, in case anything needs it there).

  • The ptxas binary, which the XLA compiler uses, is currently not included in the package at all, so this PR adds it to $CONDA_HOME/bin.
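The libdevice change amounts to a copy step along the following lines. This is a minimal sketch, not the actual diff in this PR: `stage_libdevice` and its `prefix` argument are hypothetical names, and `prefix` stands in for $CONDA_HOME. The ptxas step is left out because the description does not say where the binary is taken from.

```python
import shutil
from pathlib import Path


def stage_libdevice(prefix):
    """Copy lib/libdevice.10.bc to nvvm/libdevice/ under `prefix`.

    The original in lib/ is left in place, as described in the PR.
    Returns the destination path.
    """
    prefix = Path(prefix)
    src = prefix / "lib" / "libdevice.10.bc"
    dst_dir = prefix / "nvvm" / "libdevice"
    # Create nvvm/libdevice/ if the package does not ship it yet.
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as timestamps.
    return Path(shutil.copy2(src, dst_dir / src.name))
```

Copying rather than moving keeps anything that already reads the file from lib/ working.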

Unfortunately, this feedstock doesn't have any branches for older CUDA versions, so this PR is only against the latest (11.0).

@jayfurmanek
Author

@katietz FYI

@jayfurmanek
Author

@jjhelmus

@jayfurmanek
Author

Note I didn't test this on Windows since I don't have a Windows machine.

@jakirkham

@jayfurmanek I would suggest submitting this to the conda-forge feedstock (https://github.com/conda-forge/cudatoolkit-feedstock).

Also, FYI, Jonathan has left Anaconda.

@leofang

leofang commented Mar 12, 2021

@jayfurmanek Could you also open a PR against https://github.com/conda-forge/cudatoolkit-feedstock if you have time? 🙂 It is maintained by some NVIDIA folks, so it's in better shape. I am less certain whether ptxas is redistributable under NVIDIA's EULA, but the libdevice.10.bc change looks harmless, at least on Linux.

By the way, I think Jonathan is no longer a maintainer here. You might have to ping someone else from Anaconda (try https://gitter.im/conda/conda).

cc: @jakirkham @kkraus14

@jayfurmanek
Author

@jakirkham What do you think about ptxas being redistributable? It is part of the CUDA Toolkit and presumably under the same terms?

I can look at putting this up in forge as well.

@kkraus14

> @jakirkham What do you think about ptxas being redistributable? It is part of the CUDA Toolkit and presumably under the same terms?
>
> I can look at putting this up in forge as well.

ptxas is not redistributable per the EULA: https://docs.nvidia.com/cuda/eula/index.html#attachment-a

@kkraus14

Will ping some folks internally to see what can be done here

@leofang

leofang commented Mar 14, 2021

Aha John and I replied at the same time!

@katietz

katietz commented Mar 15, 2021

I would certainly merge the libdevice.10.bc part into Anaconda's recipe. The ptxas part needs to wait; if a solution is found for it, fine. So I would suggest opening a new PR for at least the libdevice.10.bc part.

5 participants