Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdenv: add stdenv.isAvailable pkg #245322

Closed
wants to merge 8 commits into from
Closed

stdenv: add stdenv.isAvailable pkg #245322

wants to merge 8 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jul 25, 2023

Motivation

The current meta.availableOn has two problems:

  • It is "shallow" -- a package is "available on" a platform even if its buildInputs are not
  • It only knows about the hostPlatform. For example, a package is "available" even when:
    • the compiler being used cannot emit code for the hostPlatform
    • the compiler being used cannot be built for the buildPlatform
    • the package's build process assumes it can execute hostPlatform binaries but !buildPlatform.canExecute hostPlatform

This PR fixes both problems.

Specific examples of multi-platform constraints
  • Some packages cannot build unless buildPlatform.canExecute hostPlatform, and we have no clean way to express this. Often (xdg-open, p11-kit, and others) these packages are optional dependencies, and the depending package needs to detect their unavailability so they can be omitted rather than simply failing the build.
  • LLVM can't be used if targetPlatform.isMips64n32, but GCC can
  • go is missing bootstrap binaries for a few platforms yet can generate code for those platforms, so we can cross-compile to these platforms but can't build a native compiler for them.
  • ghc is missing usable bootstrap binaries for powerpc64*, although it can emit code for (and even compile itself for) that platform.

Description of changes

This commit adds meta.availability.{build,host,target}.{bad,only}, which are computed automatically, and updates meta.availableOn to use these attributes.

  • For the hostPlatform this is done in a backwards-compatible way: meta.availability.host incorporates meta.{platforms,badPlatforms} from the package as well as meta.availability.host.{bad,only} from its buildInputs. Because the constraints are cumulative at each package, deep recursion (like checkMetaRecursively) is not needed; each package's meta.availability only looks at that package's immediate inputs.

  • For the buildPlatform and targetPlatform two additional package-provided meta attributes are needed: meta.badBuildPlatforms and meta.badTargetPlatforms. The latter is needed only for compiler-like packages.

    • I'm not sure badBuildPlatforms is useful; I don't have an example for it yet. However meta.availability.build.bad is definitely useful when we have a compiler written in its own language that can generate code for a platform but upstream doesn't provide "bootstrap binaries" for that platform. Two real-life examples: GHC on powerpc64le and golang on all flavors of MIPS.

This PR also adds a boolean, meta.requiresBuildCanExecuteHost, to signal a very common cross-compilation problem: build processes that assume the build platform can execute host platform binaries.

This PR also adds stdenv.isAvailable pkg which checks availability constraints for all three platforms (build, host, and target) as well as meta.requiresBuildCanExecuteHost -> buildPlatform.canExecute hostPlatform. This should be preferred over meta.availableOn platform pkg.

Concrete example

One use-case for this is being able to automatically compile all eligible packages using the mips64n32 ABI (which cuts memory usage dramatically for pointer-intensive data structures) and build on mips64n64 only those few packages which don't support n32 (usually because they're written in Rust or use boost-context).

The clumsy way of working around this would be to have buildRustPackage inject meta.badPlatforms values into the packages it creates, but that covers only Rust (not other LLVM-based compilers) and even then covers only those which use buildRustPackage. Really, it's the wrong layer at which to express these things.

meta.broken isn't helpful here either. It is shallow, and its "deep" version meta.available is based on throw, so it can't be used to do anything other than to abort eval (see example below). This is why meta.availableOn doesn't (and can't ever) understand meta.broken.

Includes
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@ghost ghost requested a review from Artturin July 25, 2023 06:04
@github-actions github-actions bot added 6.topic: stdenv Standard environment 6.topic: lib The Nixpkgs function library labels Jul 25, 2023
@ghost ghost marked this pull request as ready for review July 25, 2023 09:30
@ghost ghost marked this pull request as draft July 25, 2023 09:49
@ghost

This comment was marked as resolved.

@ghost ghost marked this pull request as ready for review July 25, 2023 10:01
Copy link
Member

@piegamesde piegamesde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it looks like you touched that code so you might know, why is assertValidity having both attrs and meta parameters? What's the difference between attrs.meta and meta?

pkgs/stdenv/generic/check-meta.nix Outdated Show resolved Hide resolved
pkgs/stdenv/generic/check-meta.nix Outdated Show resolved Hide resolved
pkgs/stdenv/generic/check-meta.nix Outdated Show resolved Hide resolved
@ghost
Copy link
Author

ghost commented Jul 25, 2023

Since it looks like you touched that code so you might know, why is assertValidity having both attrs and meta parameters? What's the difference between attrs.meta and meta?

Oh man was this frustrating.

attrs is whatever was passed to mkDerivation. It contains only what was passed in to mkDerivation, not anything that gets added to it.

meta is the final meta of the resulting derivation; it contains everything.

So for example, attrs.meta.validity doesn't (or shouldn't!) exist. meta.validity does exist.

The reason why attrs has to be passed in is to prevent infinite recursion (which I was bitten by several times while writing this PR). While you're creating the final meta attrset sometimes you need to reference "the meta that was passed to mkDerivation, prior to anything being added to it" because if you use the final meta you'll end up depending on your own results.

This is mainly because check-meta.nix uses a lot of } // if x then { constructions, and Nix attrsets are lazy in their attrvalues but are strict in their attrnames.

We could probably make this into a lot less of a footgun by arranging things so that the attrnames of meta (and its sub-attrsets) are fixed and don't depend in any way on the attrset passed to mkDerivation.

@Artturin Artturin requested a review from roberth August 1, 2023 16:27
Comment on lines 167 to 170
isAvailable = pkg:
lib.meta.checkAvailability buildPlatform pkg.meta.availability.build &&
lib.meta.checkAvailability hostPlatform pkg.meta.availability.host &&
lib.meta.checkAvailability targetPlatform pkg.meta.availability.target;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this is desirable.

So far we have been trying to deprecate the various stdenv.is* in favor of the more accurate stdenv.*Platform.is.

Copy link
Author

@ghost ghost Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the && in the code above.

I don't understand why this is desirable.

Because availability isn't a function of just the hostPlatform (or just the buildPlatform, or just the targetPlatform). You need all three in order to decide.

Perhaps a more-perfect solution would be to have stdenv.platform.{build,host,target} and then add stdenv.platform.isAvailable. That would place isAvailable the smallest attrset containing all the needed information. @Artturin, would this resolve your concern?

@@ -232,6 +232,8 @@ else let

outputs = outputs';

referencesOnBuild = nativeBuildInputs ++ propagatedNativeBuildInputs;
referencesOnHost = buildInputs ++ propagatedBuildInputs;
references = nativeBuildInputs ++ buildInputs

This comment was marked as outdated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pr has big performance impacts too (current ofborg eval time is ~11 minutes)

Percentages are more useful here: ~5% on all but one metric (and that one metric at +10% is not an allocation metric).

That said, there are still lots of optimization opportunities. If performance is the only remaining obstacle I will implement those. I believe the overhead can be brought down to ~1% but I don't want to put in that effort if it's going to be wasted.

Copy link
Author

@ghost ghost Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1c68dedefcb5ac3c2b2757fc076a1f9850c4c1a1 should improve this. Let's see what OfBorg says.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest eval report, at 4141be0. It's basically neutral; a mixed bag of under-5% changes (about half faster, half slower). The only thing that stands out is the +12% on nrLookups, but attrset lookups are fast and allocate no memory -- this metric is far less important than most of the others.

eval-report

@ghost
Copy link
Author

ghost commented Aug 1, 2023

Squashed.

@ghost
Copy link
Author

ghost commented Aug 2, 2023

It turns out that most of the eval overhead is due to people making wacky choices for meta.platforms (sometimes isGnu, sometimes isLinux, sometimes isFoundInMyPersonalPileOfHardware, etc). Since they're whitelists rather than blacklists you can't simply logical-OR your dependencies' lists. So the lists grow uncontrollably. This is a long-known problem with meta.platforms.

In practice, meta.platforms is totally useless for anything other than being the default value of meta.hydraPlatforms.

So I'm going to leave meta.platforms' current shallow behavior and simply treat it as what it is: a synonym for meta.hydraPlatforms and nothing more.

Adam Joseph added 6 commits August 2, 2023 11:48
This commit adds `stdenv.isAvailable pkg`, which checks `meta`
availability not only for the hostPlatform, but also for the
`buildPlatform` and `targetPlatform` using
`lib.meta.checkAvailability`.

Our current meta.{platforms,badPlatforms} scheme isn't capable of
expressing constraints that apply to anything other than the
hostPlatform.  For example:

- LLVM can't be used with targetPlatform.isMips64n32, but GCC can

The clumsy way of working around this is to have `buildRustPackage`
inject `meta.badPlatforms` values into the packages it creates, but
that covers only Rust (not other LLVM-based compilers) and only
those which use `buildRustPackage`.  Really, it's the wrong layer at
which to express these things.

This commit adds `meta.availability.{build,host,target}.{bad,only}`
which are computed automatically based on a package's dependencies.

This is done in a backwards-compatible way for the hostPlatform:
`meta.availability.host` will incorporate
`meta.{platforms,badPlatforms}` from both the package as well as all
of its `buildInputs`.  The constraints are cumulative, so deep
recursion (like checkMetaRecursively) is not needed; each package's
`meta.availability` only needs to look at that package's immediate
inputs.

This commit also causes check-meta.nix to utilize the information in
meta.badBuildPlatforms and meta.badTargetPlatforms.

In particular, the following test commands now behave correctly:

  # XFAIL: golang can't build *on* MIPS, but it can cross compile to it
  nix-instantiate . -A pkgsCross.mipsel-linux-gnu.go

  # SUCCESS: golang can't build *on* MIPS, but it can cross compile to it
  nix-instantiate . -A pkgsCross.mipsel-linux-gnu.dnscrypt-proxy2

  # XFAIL: golang can't target Mips64n32
  nix-instantiate . -A pkgsCross.mips64el-linux-gnuabin32.dnscrypt-proxy2

  # XFAIL: rust can't target Mips64n32
  nix-instantiate . -A pkgsCross.mips64el-linux-gnuabin32.tiny
We can't build a `go` compiler on MIPS due to lack of bootstrap
binaries.  However, go is in fact capable of emitting code for MIPS,
so we can cross compile to a MIPS target (useful for embedded
routers!).
@ghost
Copy link
Author

ghost commented Aug 2, 2023

Squashed.

@ghost
Copy link
Author

ghost commented Aug 2, 2023

Latest eval report, at 4141be0. It's basically neutral; a mixed bag of under-5% changes (about half faster, half slower). The only thing that stands out is the +12% on nrLookups, but attrset lookups are fast and allocate no memory -- this metric is far less important than most of the others.

That list-concats metric is bogus; Nix has a fast-path for x++[] and []++x:

https://github.com/NixOS/nix/blob/3723363697b3908a2f58dce3e706783b1c783414/src/libexpr/eval.cc#L1888-L1898

But the nrListConcat counter gets incremented even if you hit the fast-path:

https://github.com/NixOS/nix/blob/3723363697b3908a2f58dce3e706783b1c783414/src/libexpr/eval.cc#L1884

The far more important metric is nrListElems, which this PR actually improves by ~3%.

eval-report

@ghost
Copy link
Author

ghost commented Aug 2, 2023

Next push will be an experiment with having hashing replace sorting, to see which wins.

@ghost ghost marked this pull request as draft August 2, 2023 20:49
@ghost ghost mentioned this pull request Aug 17, 2023
25 tasks
@ghost ghost mentioned this pull request Aug 26, 2023
12 tasks
@bew
Copy link
Contributor

bew commented Nov 9, 2023

The PR is still in draft, what is missing before being ready for review?

@ghost ghost mentioned this pull request Jan 18, 2024
13 tasks
@ghost ghost closed this Jan 23, 2024
@ghost ghost deleted the pr/stdenv/isAvailable branch January 23, 2024 06:50
@bew
Copy link
Contributor

bew commented Jan 23, 2024

Why did you closed it?
Is it superseded by anything else?

@piegamesde
Copy link
Member

#50105 (comment) somebody else will have to pick up the work probably

@roberth
Copy link
Member

roberth commented Jan 23, 2024

Adam has closed many of his PRs. I don't think that was necessary, and I hope he's doing well.

@ghost
Copy link
Author

ghost commented Jan 23, 2024

I hope he's doing well.

I'm fine, Robert. Thanks for your concern.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants