Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cudaPackages: make getOutput work again #301180

Merged

Conversation

SomeoneSerge
Copy link
Contributor

@SomeoneSerge SomeoneSerge commented Apr 3, 2024

Not yet sure what the consequences are going to be, testing will show.

Description of changes

Currently, cudaPackages' manifest.nix breaks lib.getOutput. Specifically the following lines resolve into the same store path ("$out") repeated thrice (as discovered by @ConnorBaker):

(getDev libcublas)
(getLib libcublas)
(getOutput "static" libcublas)

The change must've slipped through during the "cuda-modules"/jetson support refactoring.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@SomeoneSerge SomeoneSerge added the 6.topic: cuda Parallel computing platform and API label Apr 3, 2024
@SomeoneSerge SomeoneSerge requested a review from ConnorBaker April 3, 2024 08:42
@ofborg ofborg bot requested review from aidalgol, mdaiter and samuela April 3, 2024 09:10
@ofborg ofborg bot added 11.by: package-maintainer This PR was created by the maintainer of the package it changes 10.rebuild-darwin: 101-500 10.rebuild-linux: 101-500 labels Apr 3, 2024
@SomeoneSerge
Copy link
Contributor Author

SomeoneSerge commented Apr 7, 2024

Testing in https://hercules-ci.com/github/SomeoneSerge/nixpkgs-cuda-ci/jobs/10443. I see that stuff like opencv and blender builds alright

@samuela
Copy link
Member

samuela commented Apr 7, 2024

Ooc why is it safe to remove this now? Have we landed a change to how the outputs work since this code was originally written?

@SomeoneSerge
Copy link
Contributor Author

@samuela I updated the description to provide some context

@ConnorBaker
Copy link
Contributor

I could have sworn this was something necessitated by our use of multiple outputs due to the multiple-outputs.sh setup hook but I can’t find that any more.

The only relevant reference I’ve seen so far is

if ! pkg ? outputSpecified || ! pkg.outputSpecified
, which maybe should be refactored? I don’t see the need for why this attribute should exist. (Or at the least, I suspect the condition in that function could be simplified.)

@SomeoneSerge have you verified that this doesn’t introduce extra stuff to any of the split component outputs? So long as it still works and the outputs and closures of consumers are as expected, I’m fine with this change.

@SomeoneSerge
Copy link
Contributor Author

SomeoneSerge commented Apr 7, 2024

have you verified that this doesn’t introduce extra stuff to any of the split component outputs

Well grep in a current revision of nixpkgs doesn't show any matches in bash scripts, so I think it's very probably that it can't. But no, I did not manually compare the outputs. Would you like me to?

which maybe should be refactored? I don’t see the need for why this attribute should exist. (Or at the least, I suspect the condition in that function could be simplified.)

I guess pkg.outputSpecified or false would be more readable. I hope that's valid in nix 2.13 too 🙃 .

Speaking of refactoring, I think getOutput = output: pkg: assert (!(pkg.outputSpecified or false)) || pkg.outputName == output; pkg.${output} or pkg.out or pkg would've made more sense than just ignoring output. This way we'd have noticed the saxpy regression early on

@ConnorBaker
Copy link
Contributor

Looked at the outputs of a few different programs and didn't see anything out of place. As an example, here's libcublas:

On this PR:

$ nix build --impure -L .#cudaPackages.libcublas^*
$ for f in result*; do echo "Looking at $f"; nix path-info -rSsh "./$f"; done
Looking at result
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1         	   1.7M	   1.7M
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc        	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc       	 155.8K	 155.8K
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7            	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5             	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib           	   8.7M	  39.8M
/nix/store/l8ylz3d598vm310zy5bg5mkd77ahjxwf-libcublas-12.2.5.6-lib   	 604.1M	 643.9M
/nix/store/8q4q9vl32hy63bmmdd2wc78zqn9wqc1x-libcublas-12.2.5.6-dev   	 549.9K	 644.5M
/nix/store/yynka8djck214kpjgwrvbnhw5ij46xsi-libcublas-12.2.5.6-static	 889.8M	 889.8M
/nix/store/m5774innl5b78vc8yg1jm1wq0fii6m4q-libcublas-12.2.5.6       	 499.2K	   1.5G
Looking at result-dev
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1      	   1.7M	   1.7M
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc     	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc    	 155.8K	 155.8K
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7         	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5          	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib        	   8.7M	  39.8M
/nix/store/l8ylz3d598vm310zy5bg5mkd77ahjxwf-libcublas-12.2.5.6-lib	 604.1M	 643.9M
/nix/store/8q4q9vl32hy63bmmdd2wc78zqn9wqc1x-libcublas-12.2.5.6-dev	 549.9K	 644.5M
Looking at result-lib
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1      	   1.7M	   1.7M
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc     	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc    	 155.8K	 155.8K
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7         	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5          	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib        	   8.7M	  39.8M
/nix/store/l8ylz3d598vm310zy5bg5mkd77ahjxwf-libcublas-12.2.5.6-lib	 604.1M	 643.9M
Looking at result-static
/nix/store/yynka8djck214kpjgwrvbnhw5ij46xsi-libcublas-12.2.5.6-static	 889.8M	 889.8M

On master:

$ nix build --impure -L .#cudaPackages.libcublas^*
$ for f in result*; do echo "Looking at $f"; nix path-info -rSsh "./$f"; done
Looking at result
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc        	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc       	 155.8K	 155.8K
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1         	   1.7M	   1.7M
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7            	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5             	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib           	   8.7M	  39.8M
/nix/store/gk7iybbaak978qr2kchzbzpg5alkkxza-libcublas-12.2.5.6-lib   	 604.1M	 643.9M
/nix/store/9bq0574rddqfk8acd429m2p8ywc4fr03-libcublas-12.2.5.6-dev   	 549.9K	 644.5M
/nix/store/fq4qmlz74j8l3yf56vby7j5121sjfhz2-libcublas-12.2.5.6-static	 889.8M	 889.8M
/nix/store/7d3lkvaqvjf3765zmvznmkkn5wrkjmdk-libcublas-12.2.5.6       	 499.2K	   1.5G
Looking at result-dev
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1      	   1.7M	   1.7M
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc     	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc    	 155.8K	 155.8K
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7         	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5          	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib        	   8.7M	  39.8M
/nix/store/gk7iybbaak978qr2kchzbzpg5alkkxza-libcublas-12.2.5.6-lib	 604.1M	 643.9M
/nix/store/9bq0574rddqfk8acd429m2p8ywc4fr03-libcublas-12.2.5.6-dev	 549.9K	 644.5M
Looking at result-lib
/nix/store/7n0mbqydcipkpbxm24fab066lxk68aqk-libunistring-1.1      	   1.7M	   1.7M
/nix/store/a2fqd5k8ymb2cadb5awp5b90a5m8acln-gcc-13.2.0-libgcc     	 155.8K	 155.8K
/nix/store/rxganm4ibf31qngal3j3psp20mak37yy-xgcc-13.2.0-libgcc    	 155.8K	 155.8K
/nix/store/s32cldbh9pfzd9z82izi12mdlrw0yf8q-libidn2-2.3.7         	 352.7K	   2.1M
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5          	  28.7M	  31.0M
/nix/store/c2yb135iv4maadia5f760b3xhbh6jh61-gcc-13.2.0-lib        	   8.7M	  39.8M
/nix/store/gk7iybbaak978qr2kchzbzpg5alkkxza-libcublas-12.2.5.6-lib	 604.1M	 643.9M
Looking at result-static
/nix/store/fq4qmlz74j8l3yf56vby7j5121sjfhz2-libcublas-12.2.5.6-static	 889.8M	 889.8M

It may be surprising that the dev output has a dependency on lib, but that's because it has the pkgconfig files and so a dependency is introduced that way.

@ConnorBaker ConnorBaker merged commit 0fcbda8 into NixOS:master Apr 8, 2024
30 checks passed
@ConnorBaker
Copy link
Contributor

@SomeoneSerge @samuela looks like I wasn't thorough enough: this change broke compilation of NCCL (and I suspect other downstream consumers which access components unqualified): https://gist.github.com/ConnorBaker/deda25fd1c2b89e76770a7f3a2eceb41

I don't fully understand what's happening here.

Luckily, there aren't that many occurrences across Nixpkgs:

[connorbaker@nixos-desktop:~/Packages/nixpkgs]$ rg "outputSpecified"
lib/customisation.nix
350:            outputSpecified = true;

lib/tests/misc.nix
329:      { dev.outPath = "/dev"; outPath = "/default"; outputSpecified = true; }

lib/attrsets.nix
1763:    if ! pkg ? outputSpecified || ! pkg.outputSpecified

pkgs/build-support/buildenv/default.nix
64:        (if (! drv ? outputSpecified || ! drv.outputSpecified)

pkgs/development/compilers/llvm/15/default.nix
94:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/18/default.nix
90:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/12/default.nix
65:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/16/default.nix
95:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/git/default.nix
95:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/13/default.nix
91:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/14/default.nix
88:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/compilers/llvm/9/default.nix
62:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*
70:    llvm-polly = tools.libllvm-polly.lib // { outputSpecified = false; };

pkgs/development/compilers/llvm/17/default.nix
90:    # we need to reintroduce `outputSpecified` to get the expected behavior e.g. of lib.get*

pkgs/development/cuda-modules/generic-builders/manifest.nix
324:  # anything of the sort. To remedy this, we set outputSpecified to true, and use
327:  outputSpecified = true;
347:    # causes Nix to prefer that output over the others if outputSpecified isn't set).

pkgs/tools/typesetting/tex/texlive/default.nix
119:  toTLPkgList = drv: if drv.outputSpecified or false
135:      // { outputSpecified = true; tlOutputName = tlTypeToOut.${p.tlType}; };
140:    builtins.removeAttrs mainDrv [ "outputSpecified" ];

pkgs/tools/typesetting/tex/texlive/build-texlive-package.nix
82:    outputSpecified = true;
204:builtins.removeAttrs mainDrv [ "outputSpecified" ]

pkgs/tools/typesetting/tex/texlive/build-tex-env.nix
49:  isOldPkgList = p: ! p.outputSpecified or false && p ? pkgs && builtins.all (p: p ? tlType) p.pkgs;
62:          (p: p.outputSpecified or false -> builtins.elem (p.tlOutputName or p.outputName) [ "out" "tex" "tlpkg" ])
78:    specified = builtins.partition (p: p.outputSpecified or false) all;
118:    formatPkgs = lib.filter (p: p ? formats && (p.outputSpecified or false -> p.tlOutputName or p.outputName == "tex") && builtins.any (f: f.enabled or true) p.formats) all;
183:          (p: p.pname + (lib.optionalString (p.outputSpecified or false) " (${p.tlOutputName or p.outputName})"))

pkgs/tools/typesetting/tex/texlive/combine-wrapper.nix
32:  toSpecified = { tlType, ... }@drv: drv // { outputSpecified = true; tlOutputName = tlTypeToOut.${tlType}; };

My gut tells me it's got to be one of these:

  • lib/customisation.nix: definition of extendDerivation
  • lib/attrsets.nix: definition of getOutput and friends
  • pkgs/build-support/buildenv/default.nix: only other sensible reference to that variable

@magneticflux-
Copy link
Contributor

@ConnorBaker I just found magma failing to compile as well, complaining about missing nvcc when cudaSupport = true.

@SomeoneSerge
Copy link
Contributor Author

Testing in https://hercules-ci.com/github/SomeoneSerge/nixpkgs-cuda-ci/jobs/10443.

Damn. The lock file seems wrong 🙈, I loaded it in the repl and I can still see the outputSpecified in there

@SomeoneSerge
Copy link
Contributor Author

SomeoneSerge commented Apr 9, 2024

nix-repl> { inherit (pkgs.lib.last pkgs.cudaPackages.nccl.nativeBuildInputs) outputSpecified outputName pname; }
{ outputName = "dev"; outputSpecified = true; pname = "cuda_nvcc"; }
nix-repl> { inherit (pkgs.lib.last pkgs.magma.nativeBuildInputs) outputSpecified outputName pname; }
{ outputName = "dev"; outputSpecified = true; pname = "cuda_nvcc"; }

Something - maybe mkDerivation, maybe the splicing mechanism - replaces cuda_nvcc in nativeBuildInputs with getDev cuda_nvcc. I'd say the obvious thing is to add "${!outputBin}" to ${!outputDev}/nix-support/propagated-build-inputs.

This is the same as removing propagatedBuildOutputs = false. I suppose we'll run into cyclic references when we do?

@ConnorBaker
Copy link
Contributor

This is the same as removing propagatedBuildOutputs = false. I suppose we'll run into cyclic references when we do?

Correct -- that's the exact reason that I had to specify that flag as false.

@SomeoneSerge
Copy link
Contributor Author

SomeoneSerge commented Apr 9, 2024

 Correct -- that's the exact reason that I had to specify that flag as false.

There's a reference bin -> dev, at least because of bin/nvcc.profile referring to $dev/include. We could make a separate outputInclude, but I'm afraid if we edit

possibleOutputs = [
"bin"
"lib"
"static"
"dev"
"doc"
"sample"
"python"
];
...we'd might have to edit the manifests 😩

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: cuda Parallel computing platform and API 10.rebuild-darwin: 101-500 10.rebuild-linux: 101-500 11.by: package-maintainer This PR was created by the maintainer of the package it changes
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants