-
-
Notifications
You must be signed in to change notification settings - Fork 15.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nvidia-container-toolkit: only mount existing paths in the host #319772
nvidia-container-toolkit: only mount existing paths in the host #319772
Conversation
@ereslibre may I ask you to instead consider whether it's possible to write a NixOS test covering all these symptoms we've had to address so far? E.g. maybe we could manually prepare a fake CDI json, and create temporary fake files in place of device nodes and userspace drivers, and then run a podman/docker container with a script ensuring that all of these fake paths are mounted correctly, including referential integrity ( This would be very useful for our migration to |
I think this is a good idea. I have never written NixOS tests but I will take a look at existing ones and update the PR. |
@ereslibre actually, I now think I shouldn't have delayed merging this; we'll still need the tests though |
nixos/modules/services/hardware/nvidia-container-toolkit/cdi-generate.nix
Outdated
Show resolved
Hide resolved
Tests are WIP (ereslibre/nixpkgs@nvidia-ctk-mount-existing-paths-only...ereslibre:nixpkgs:nvidia-ctk-mount-existing-paths-only-wip); I'd like to finish them before 29th june. Otherwise I fear I'll have to resume second half of July. The "referential integrity" at this time is pretty "poor man"; not sure if you had any other approach in mind. |
e86c013
to
8d52c10
Compare
nixos/modules/services/hardware/nvidia-container-toolkit/cdi-generate.nix
Show resolved
Hide resolved
@@ -32,6 +35,18 @@ function cdiGenerate { | |||
--nvidia-ctk-path ${lib.getExe' nvidia-container-toolkit "nvidia-ctk"} | |||
} | |||
|
|||
cdiGenerate | \ | |||
${lib.concatStringsSep " | " allJqMounts} > $RUNTIME_DIRECTORY/nvidia-container-toolkit.json | |||
function additionalMount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well one comment is that this could be rewritten with pythonMinimal
or a similar language that doesn't try so much to make this painful (you could remove concatMapStringsSep and just export a json for the script)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But as commented before, if you say this is the current iteration I'll go with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll work on a reimplementation of this logic with Python in future PR's, if that's fine with you.
d223970
to
8e19362
Compare
b3ac4d8
to
9d382b5
Compare
9d382b5
to
d665ca4
Compare
@SomeoneSerge: is this good to merge on its current status? |
open = true; | ||
package = config.boot.kernelPackages.nvidiaPackages.stable.open; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehhh it's stupid that open
doesn't change the default to... an open driver
…us from the context
open = true; | ||
package = config.boot.kernelPackages.nvidiaPackages.stable.open; | ||
}; | ||
opengl.enable = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why Ofborg isn't complaining, we have a mkRenamedModule
for this one
I pushed a few cosmetic changes, tell me if they're ok by you |
Thank you @SomeoneSerge! LGTM :) |
@SomeoneSerge gentle reminder for merging if everything is fine |
Successfully created backport PR for |
Description of changes
Although the final fix for this kind of issues is #290609 (
unbreak: we're currently mounting incomplete closures; need to use exportReferencesGraph.
), this adds yet another fix for a symptom.Fixes: #319201
I will address the
exportReferencesGraph
soon that I expect will fix all these problems once and for all, but fix this problem in the meantime.Things done
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 👍 reaction to pull requests you find important.