cudaPackages: split outputs #224533

Closed
SomeoneSerge opened this issue Apr 3, 2023 · 4 comments
Assignees
Labels
3.skill: trivial This is trivial to complete (typically find-and-replace) 6.topic: cuda Parallel computing platform and API

Comments

@SomeoneSerge
Contributor

SomeoneSerge commented Apr 3, 2023

Issue description

A tracking issue, redirected from #217780

Currently our build-cuda-redist-package.nix puts static archives, executables, and headers with shared libraries into the same output. This means we cannot meaningfully separate buildInputs from nativeBuildInputs, and that CUDAToolkit's static archives end up in runtime closures for packages like jax and pytorch.

Splitting the outputs should be trivial: add an output in build-cuda-redist-package.nix, move .a files to the new output.
There is a chance of breaking discovery for downstream packages, though, which may take time to fix.

CC @NixOS/cuda-maintainers

@alyssais alyssais added the 6.topic: cuda Parallel computing platform and API label Apr 4, 2023
@github-project-automation github-project-automation bot moved this to 🆕 New in CUDA Team Apr 12, 2023
@SomeoneSerge SomeoneSerge moved this from 🆕 New to Roadmap in CUDA Team Apr 12, 2023
@SomeoneSerge SomeoneSerge added the 3.skill: trivial This is trivial to complete (typically find-and-replace) label Apr 12, 2023
@ghost

ghost commented Apr 22, 2023

> Currently our build-cuda-redist-package.nix puts static archives, executables, and headers with shared libraries into the same output. This means we cannot meaningfully separate buildInputs from nativeBuildInputs, and that CUDAToolkit's static archives end up in runtime closures for packages like jax and pytorch.

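buildInputs are dependencies for the host platform (typically libraries that the package links against and may need at runtime), while nativeBuildInputs are dependencies for the build platform (tools that are executed only during the build). With everything in a single output, a CUDA redist package cannot be listed meaningfully in one or the other, so the distinction between what is needed to build a package and what is needed to run it is lost.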

> Splitting the outputs should be trivial: add an output in build-cuda-redist-package.nix, move .a files to the new output.

I read this as writing a split (multiple-output) derivation. We add the outputs attribute list: out is the default output, and static is a new output for the static libraries. The installPhase can then refer to the $static variable, and $static/lib is the path that will hold the static libraries.

  outputs = [ "out" "static" ];

  nativeBuildInputs = [
    autoPatchelfHook
    # This hook will make sure libcuda can be found
    # in typically /lib/opengl-driver by adding that
    # directory to the rpath of all ELF binaries.
    # Check e.g. with `patchelf --print-rpath path/to/my/binary`
    autoAddOpenGLRunpathHook
  ];

  dontBuild = true;

  # TODO: choose whether to install static/dynamic libs
  installPhase = ''
    runHook preInstall
    rm LICENSE
    mkdir -p $out $static/lib
    mv * $out
    runHook postInstall
  '';

What about other files, such as headers, that are not needed in the runtime closure? Should we separate those as well, for example $out/include? I see now that those files are much smaller, so we will only move the .a files.

Now how do we actually move those static libraries in the installPhase?

  • mkdir -p $out $static/lib creates the output directories for the default output and the static output.
  • mv * $out moves all the libraries into the default output directory.
  • The next line should move the .a static libraries from $out into $static/lib.

I see, for example in issue #164141, that static libraries are a general problem and apparently not so trivial. I am looking into possible ways to move them into the static output.

Using find with -exec seems to be a common way to do this, so the installPhase would then be defined like this:

  # TODO: choose whether to install static/dynamic libs
  installPhase = ''
    runHook preInstall
    rm LICENSE
    mkdir -p $out $static/lib
    mv * $out
    find $out -name "*.a" -exec mv {} $static/lib/ \;
    runHook postInstall
  '';
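To check what these install steps do without running a Nix build, the same commands can be tried in a scratch directory. This is only a minimal sketch: the directories standing in for $out and $static are created with mktemp, and the file names (libcudart.so, libcudart_static.a, cuda_runtime.h) are illustrative placeholders, not a listing of what the redist tarball actually contains.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch stand-ins for the $out and $static paths that Nix would provide.
work=$(mktemp -d)
out="$work/out"
static="$work/static"

# Hypothetical unpacked source tree: a license file, one shared library,
# one static archive and one header (placeholder names).
src="$work/src"
mkdir -p "$src/lib" "$src/include"
touch "$src/LICENSE" \
      "$src/lib/libcudart.so" \
      "$src/lib/libcudart_static.a" \
      "$src/include/cuda_runtime.h"

# Same steps as the proposed installPhase, run from the source directory.
cd "$src"
rm LICENSE
mkdir -p "$out" "$static/lib"
mv * "$out"
find "$out" -name '*.a' -exec mv {} "$static/lib/" \;

# List where each file ended up.
find "$out" "$static" -type f | sort
```

Note that this find invocation flattens everything into $static/lib, so two archives with the same file name in different subdirectories would overwrite each other. Whether that can happen for the redist packages is not something this sketch checks.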

Now when I try to save the file I do not have permissions to do so:

Failed to save 'build-cuda-redist-package.nix': Unable to write file '/nix/store/26d6xg6m11gp22al1gxr1h8zdfmf9j94-source/pkgs/development/compilers/cudatoolkit/redist/build-cuda-redist-package.nix' (Unknown (FileSystemError): Error: EROFS: read-only file system, open '/nix/store/26d6xg6m11gp22al1gxr1h8zdfmf9j94-source/pkgs/development/compilers/cudatoolkit/redist/build-cuda-redist-package.nix')

I'm doing something dumb with the environment. I haven't understood how one enters the nix shell, or whether that is even needed here. First I ran git checkout -b fix-cuda-static-libs and then nix edit nixpkgs#cudaPackages.cuda_cudart, but I don't think that is the right way to update the file. Also, if I simply run the following, <nixpkgs> resolves to that same read-only store path rather than to my checkout:

nix eval --impure --expr "<nixpkgs>"
/nix/store/26d6xg6m11gp22al1gxr1h8zdfmf9j94-source

I have to investigate more. I am reading the Nixpkgs manual and watching "Nixpkgs - Adding a package to unstable/master branch" by Jon Ringer.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/cuda-team-roadmap-and-call-for-sponsors/29495/1

@ConnorBaker ConnorBaker self-assigned this Jun 26, 2023
@ConnorBaker ConnorBaker moved this from 🔮 Roadmap to 🏗 In progress in CUDA Team Jun 26, 2023
@ConnorBaker
Contributor

I'll be working on this.

#229758 seems important for what we aim to do. I'll correspond with the author to see what their thoughts are.

@ConnorBaker
Contributor

Closed by #240498.

@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in CUDA Team Dec 4, 2023