haskell.compiler.ghc*: fix cross-built native GHC #243619
6cc98f1
to
e5222ff
Compare
e5222ff
to
6a40714
Compare
Hi, thanks for writing this PR! This is a really, really big PR (not your fault; ghc still uses copy-paste-tweak style instead of switch-on-version style) in monolithic single-commit form, which makes it very hard to review. Would you mind breaking it up into a separate commit for each mechanical change (like "search and replace FOO with BAR" or "replace libffi with targetPackages.libffi")? Then we can focus our review effort on the remaining commits with non-mechanical changes, which should be short and focused.
Also, have you tried building anything that uses Template Haskell? Last time I tried to get
Unfortunately Template Haskell is widely used in the Haskell ecosystem; there are a couple of pervasive dependencies (like Vty) that use it, so everything that uses those needs it.
That is kind of orthogonal to this PR which gets |
Ah, so, first line of the PR should probably be adjusted to:
|
@amjoseph-nixpkgs I assume by this you mean split it up by change (e.g. apply X change to each version) and not split it up by version (e.g. fix 9.4.5), right? Sure.
I have encountered a few errors that mentioned I haven't tried using the cross-compilers to build any TH-using code. As sternenseemann mentioned, this is more about fixing cross-compiling GHC itself. I assume Nixpkgs already supports building GHC as a cross-compiler, but if it doesn't, this PR also makes that possible. |
Correct!
Ah, thanks, now I get it! Yes this will be very useful, because the ancient GHC-supplied binaries we've been using to bootstrap on |
|
6a40714
to
40f2e8f
Compare
@amjoseph-nixpkgs I have split this into 11 separate (hopefully atomic) commits. I've also spent much of my week trying to compile GHC 9.6.2 natively on RISC-V using an unregisterised GHC 9.4.5 cross-compiled using these changes, and I've figured out what it takes.
pkgs.haskell.compiler.ghc962.override (old: rec {
  bootPkgs = pkgs.haskell.packages.ghc945.override {
    buildHaskellPackages = bootPkgs;
    ghc = let
      passthru = {
        targetPrefix = "";
        enableShared = false;
        hasHaddock = false;
        llvmPackages = pkgs.llvmPackages_12;
        haskellCompilerName = "ghc-9.4.5";
      };
    in passthru // {
      version = "9.4.5";
      outPath = builtins.storePath boot/ghc;
      inherit passthru;
      meta = {
        license = lib.licenses.bsd3;
        platforms = [ "riscv64-linux" ];
      };
    };
    overrides = self: super: {
      mkDerivation = args: super.mkDerivation ({
        enableLibraryProfiling = false;
      } // args);
      # These test suites don't compile with boot GHC
      alex = pkgs.haskell.lib.compose.dontCheck super.alex;
      data-array-byte = pkgs.haskell.lib.compose.dontCheck super.data-array-byte;
      doctest = pkgs.haskell.lib.compose.dontCheck super.doctest;
      hashable = pkgs.haskell.lib.compose.dontCheck super.hashable;
      optparse-applicative = pkgs.haskell.lib.compose.dontCheck super.optparse-applicative;
      QuickCheck = pkgs.haskell.lib.compose.dontCheck super.QuickCheck;
      temporary = pkgs.haskell.lib.compose.dontCheck super.temporary;
      vector = pkgs.haskell.lib.compose.dontCheck super.vector;
    };
  };
})
And I needed to patch Nixpkgs so that Hadrian wouldn't try to use RTS flags only supported by the threaded runtime.
diff --git a/pkgs/development/tools/haskell/hadrian/default.nix b/pkgs/development/tools/haskell/hadrian/default.nix
index 5911c34982b..da4d194e220 100644
--- a/pkgs/development/tools/haskell/hadrian/default.nix
+++ b/pkgs/development/tools/haskell/hadrian/default.nix
@@ -29,6 +29,8 @@ mkDerivation {
# Additionally we need to recompile it on every change of UserSettings.hs.
# See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1190
"-O0"
+ # Don't use threaded-only RTS flags at runtime
+ "-f-threaded"
];
isLibrary = false;
isExecutable = true;
I'd like to make it possible to use a non-threaded GHC to build GHC, but I'm not sure how. Should it be as above? Should it be an overridable option to GHC which is propagated to Hadrian? Should it be a flag on GHC derivations similar to |
Is it possible to have Hydra cross-compile GHC for all platforms? It could potentially simplify bootstrapping on unsupported platforms by substituting the derivation instead of having to manually build it on a supported platform, copy the closure to the host, and write some rather elaborate code to convince Nixpkgs to use it as a boot compiler. In fact, we could even configure Nixpkgs to do so by default on unsupported platforms, so that users can just |
We can via |
Sorry I took so long to get back to you! This looks great, I'll be sure to play around with it a bit as well. It is also somehow simpler than I expected, e.g. it is not necessary to set any make variables telling it to build stage 2?
All suggestions are only noted once, since the changes are duplicated (very neatly, I must say) over the expressions.
I normally dislike tentative changes that don't work yet, but I guess the changes to the hadrian expression are alright as they retain some measure of consistency with the old expressions!
Also since this PR was opened, 9.4.6.nix was added, so we'll need to remember to take care of that at the end.
@@ -250,6 +251,9 @@ stdenv.mkDerivation (rec {

postPatch = "patchShebangs .";

# GHC is unable to build a cross-compiler without this set.
Can you be more specific why this is needed? I'm guessing the build->build CC needs to be exposed as $CC?
Unfortunately I have no idea myself.
I figured out it was needed by diffing the environment variables between stdenv.mkDerivation and pkgsBuildTarget.stdenv.mkDerivation (IIRC).
@@ -132,6 +132,7 @@ let
pkgsBuildTarget.targetPackages.stdenv.cc
] ++ lib.optional useLLVM buildTargetLlvmPackages.llvm;

buildCC = pkgsBuildHost.stdenv.cc;
Use buildPackages.stdenv.cc here, not because it makes an actual difference, but because it makes everything a bit clearer, since it is the convention.
Alternatively, for consistency with the others, pkgsBuildBuild.targetPackages.stdenv.cc.
@@ -278,6 +282,9 @@ stdenv.mkDerivation (rec {
# LLVM backend on Darwin needs clang: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/codegens.html#llvm-code-generator-fllvm
export CLANG="${buildTargetLlvmPackages.clang}/bin/${buildTargetLlvmPackages.clang.targetPrefix}clang"
'' + ''
export CC_STAGE0="${buildCC}/bin/${buildCC.targetPrefix}cc"
export LD_STAGE0="${buildCC.bintools}/bin/${buildCC.bintools.targetPrefix}ld${lib.optionalString useLdGold ".gold"}"
We can't reuse useGold here, since it is about the targetCC only. There is a possibility we are compiling with mismatched bintools (LLVM/GNU), so we'd need to consider this separately here.
I can't believe that "use gold" isn't an option we can pass to the bintools-wrapper. It seems like that would be the right place for this kind of setting rather than repeating it in every package. But this is a criticism of bintools-wrapper, not a criticism of ghc or this PR.
You can now (#239247), but we want to use gold whenever it is available, even where it is not the default, so the setting wouldn't be global. (This does mean that we can simplify useGold a bit though.)
@@ -326,7 +326,8 @@ stdenv.mkDerivation (rec {
# `--with` flags for libraries needed for RTS linker
configureFlags = [
"--datadir=$doc/share/doc/ghc"
"--with-curses-includes=${ncurses.dev}/include" "--with-curses-libraries=${ncurses.out}/lib"
"--with-curses-includes=${pkgsBuildHost.ncurses.dev}/include"
"--with-curses-libraries=${pkgsBuildHost.ncurses.out}/lib"
Why does this matter? Isn't terminfo disabled as soon as there is any kind of cross compilation happening? Then it would make more sense to pass these flags only conditionally.
It is supposedly disabled, but something still ends up using ncurses.
sed -i $out/lib/${targetPrefix}${passthru.haskellCompilerName}/settings \
-e "s!$CC!${installCC}/bin/${installCC.targetPrefix}cc!g" \
-e "s!$CXX!${installCC}/bin/${installCC.targetPrefix}c++!g" \
-e "s!$LD!${installCC.bintools}/bin/${installCC.bintools.targetPrefix}ld${lib.optionalString useLdGold ".gold"}!g" \
Here technically a separate useGold condition makes sense, but I don't think bintools can in practice differ between pkgsHostTarget and pkgsBuildTarget.
@@ -134,6 +134,7 @@ let

buildCC = pkgsBuildHost.stdenv.cc;
targetCC = builtins.head toolsForTarget;
installCC = pkgsHostTarget.gcc;
pkgsHostTarget.targetPackages.stdenv.cc is nicer, because it respects the default compiler and doesn't statically require gcc.
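To keep the three compiler roles from these review comments apart, here is a minimal sketch. It is assembled from the names used in the thread (buildCC, targetCC, installCC) and the reviewers' suggested expressions, not copied from the PR; the file name ccs.nix and the standalone-function form are assumptions made purely for illustration.

```nix
# ccs.nix (hypothetical file name): a sketch of the three C compiler roles
# discussed in this review, using the expressions the reviewers suggested.
{ buildPackages, pkgsBuildTarget, pkgsHostTarget }:

{
  # Compiler handed to the stage 0 (boot) build via CC_STAGE0 in the diff.
  # The review suggests buildPackages.stdenv.cc over pkgsBuildHost.stdenv.cc
  # purely for convention.
  buildCC = buildPackages.stdenv.cc;

  # First entry of toolsForTarget in the diff: used while building GHC to
  # produce code for the target platform.
  targetCC = pkgsBuildTarget.targetPackages.stdenv.cc;

  # Written into the installed settings file by the sed call in the diff.
  # Taken from pkgsHostTarget so that the installed GHC refers to a compiler
  # for the platform it will run on, without statically requiring gcc.
  installCC = pkgsHostTarget.targetPackages.stdenv.cc;
}
```

Under these assumptions, something like `pkgsCross.riscv64.callPackage ./ccs.nix { }` should evaluate to the three compilers for a cross-built native GHC, which makes it easy to compare their `targetPrefix` values.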
# option will force it to do an unregistered build when set to true.
# See https://gitlab.haskell.org/ghc/ghc/-/wikis/building/unregisterised
# Registerised RV64 cross-compiler currently produces programs that segfault
enableUnregisterised ? !stdenv.buildPlatform.isRiscV64 && stdenv.targetPlatform.isRiscV64
Shouldn't we check stdenv.hostPlatform.isRiscV64? Or does it also segfault if the native compiler is cross-compiled? Is there an issue we can link to?
In my testing, I found that a registerised stage 1 cross-compiler outputs programs that segfault, meaning that both stage 1 (outputs broken programs) and stage 2 (is itself broken) are unusable. I don't see the need to check hostPlatform, since targetPlatform is RiscV64 whether we want stage 1 or 2.
There isn't an upstream issue yet, but I've been meaning to create one. I'm currently doing a native build of 9.2.8 to verify that it isn't already fixed in 9.6.2 (the only version I've tested on native so far). I'll add a link to it in the code once I create it.
I've since discovered that native RISC-V compilers also segfault, so I'll change this to stdenv.hostPlatform.isRiscV64 || stdenv.targetPlatform.isRiscV64 (hostPlatform to ensure stage 2 GHC doesn't crash, targetPlatform to ensure compiled Haskell programs don't crash).
And I'll also include a link to a GHC issue.
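The difference between the two defaults discussed above can be checked with a small stand-alone sketch. The platform attribute sets here are mocks (only `isRiscV64` is modelled) and the names `oldDefault`, `newDefault`, and `cases` are made up for illustration; only the two boolean expressions come from the thread.

```nix
# Stand-alone sketch; evaluate with: nix-instantiate --eval --strict ./defaults.nix
let
  # Mock platforms: only the isRiscV64 flag is modelled (an assumption).
  mkPlatform = isRiscV64: { inherit isRiscV64; };
  x86 = mkPlatform false;
  rv64 = mkPlatform true;

  # First default in this PR: only compilers built elsewhere that target RV64.
  oldDefault = { buildPlatform, hostPlatform, targetPlatform }:
    !buildPlatform.isRiscV64 && targetPlatform.isRiscV64;

  # Revised default: anything that runs on or targets RV64.
  newDefault = { buildPlatform, hostPlatform, targetPlatform }:
    hostPlatform.isRiscV64 || targetPlatform.isRiscV64;

  cases = {
    # build = host = x86, target = RV64 (a cross-compiler)
    crossCompiler = { buildPlatform = x86; hostPlatform = x86; targetPlatform = rv64; };
    # build = x86, host = target = RV64 (the case this PR fixes)
    crossBuiltNative = { buildPlatform = x86; hostPlatform = rv64; targetPlatform = rv64; };
    # build = host = target = RV64
    native = { buildPlatform = rv64; hostPlatform = rv64; targetPlatform = rv64; };
  };
in
builtins.mapAttrs (_: c: { old = oldDefault c; new = newDefault c; }) cases
```

The two expressions agree for the cross-compiler and the cross-built native compiler; they differ only for the fully native RV64 build, where the first default leaves the build registerised and the revised one does not.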
To clarify, one option would be to use nixpkgs/pkgs/top-level/release-cross.nix Lines 45 to 50 in a140137
The question is of course if this makes sense for all platforms Alternatively we can make a new section in |
40f2e8f
to
6f791b4
Compare
This is no longer necessary due to the removal of 8.8.4.
See #173952, which is a fix for binutils-wrapper. cc-wrapper has the same problem: they reference the build platform's shell in the wrapper script when they should be referencing the host platform's shell. This results in GHC not being runnable on the host platform without emulating the build platform.
Emulation can be used as a workaround.
Done. I haven't limited it to x86_64, as I assume that cross-compiling should work on any build platform. |
This is great. Thanks for doing this! I hope we can use this to get "trusted" bootstrap tarballs for new platforms (riscv64) without waiting for GHCHQ to release bindists. I have not yet built any of it but looked over some of the changes.
Unregisterised riscv64 builds
This is a pity but of course unavoidable in general for new platforms. I would still like to get newer e.g. >= 9.6 registerised builds working if possible. I think @AlexandreTunstall had some trouble with registerised cross-compiled ghcs. Can you comment what exactly went wrong and which versions you tried? In my experience native registerised builds for >= 9.6 work out of the box. Also Debian ships with a native 9.4.7 registerised build which I believe they got working by building against llvm 15. Might be useful if you do any testing since that is less painful than starting from an unregisterised one. Here is the info of that build:
Interpreter in riscv64 hadrian builds
The move to hadrian initially disabled the interpreter on riscv64. This is a bit minor if you just want a compiler to bootstrap. There are two upstream MRs enabling that again:
# See https://gitlab.haskell.org/ghc/ghc/-/wikis/building/unregisterised
# Registerised RV64 compiler produces programs that segfault
# See https://gitlab.haskell.org/ghc/ghc/-/issues/23957
enableUnregisterised ? stdenv.hostPlatform.isRiscV64 || stdenv.targetPlatform.isRiscV64
Registerised riscv64 builds of ghc >=9.6 work for me natively (build = host = target) and this is the default behaviour before this PR. Can you preserve that for all hadrian-based builds?
Is that true even when building more complex programs like Pandoc?
From my testing with 9.6.2, registerised initially appeared to work until I tried compiling something more complex.
I'll give building with LLVM 15 a try.
In the meantime, I'll undo this change in common-hadrian.nix.
For older versions of GHC, LLVM 15 is not officially supported, so I don't want to take the risk of subtly breaking them while trying to fix registerised builds.
Especially now that 9.6 is the default version.
Is that true even when building more complex programs like Pandoc?
From my testing with 9.6.2, registerised initially appeared to work until I tried compiling something more complex.
Yes I got pandoc and shellcheck installed and tested basic functionality. Only cachix failed to compile with "error: cycle detected in build of '/nix/store/[..]-cachix-1.7.drv' in the references of output 'bin' from output 'out'" which I have yet to look into. I will try to rebuild a new registerised 9.6.4 from your branch and will report back.
9.6.2 still segfaults on my old (~August) NixOS build when using LLVM 15.
[ 1 of 16] Compiling Language.Haskell.HsColour.Classify ( Language/Haskell/HsColour/Classify.hs, dist/build/Language/Haskell/HsColour/Classify.o, dist/bu>
/nix/store/mjlk72gj834v6lmbk9j9pvby88s686cz-stdenv-linux/setup: line 1597: 202 Segmentation fault (core dumped) ./Setup build
This was bootstrapped using a native unregisterised 9.2.8, which was itself booted from a cross-compiled 8.10.7 compiled with the original version of this PR. All native packages were compiled for RV64GC_Zba_Zbb using gcc13Stdenv.
Some (likely useless) kernel logs:
ghc_worker[772846]: unhandled signal 11 code 0x1 at 0x0000004e3584b582
CPU: 2 PID: 772846 Comm: ghc_worker Not tainted 6.5.0 #1-NixOS
Hardware name: StarFive VisionFive 2 v1.3B (DT)
epc : 0000004e3584b582 ra : 0000003ff3e06900 sp : 00000004e3ffa730
gp : 000000000006ea80 tp : 00000004e3fff8e0 t0 : 0000000000004000
t1 : 0000003ff3e00628 t2 : 0000003ff6f86598 s0 : 00000004f148d652
s1 : 0000003fedb8afd8 a0 : 0000000000000000 a1 : 0000003fed8dd038
a2 : 0000000000000002 a3 : 0000000000000002 a4 : 0000000000000032
a5 : 09e1854e3584b583 a6 : 0000000000000000 a7 : 0000003ff7ffdd50
s2 : 00000004f311b9f8 s3 : 00000004f16d93c8 s4 : 0000003fed8dc5e0
s5 : 00000004f31140c0 s6 : 00000004f148d638 s7 : 00000004f36d2870
s8 : 00000004f2ce1000 s9 : 00000000000003bd s10: 0000003fedb59058
s11: 00000004f31140c0 t3 : 0000003fed2f0514 t4 : 0000000000000001
t5 : 0000003fed9c6e2a t6 : 000000000000002e
status: 8000000200006020 badaddr: 0000004e3584b582 cause: 000000000000000c
I'll try 9.6.4 on a more up-to-date Nixpkgs to see if the issue still happens.
I discovered that the 6.5 kernel has a regression that causes some applications (notably rustc) to segfault. I've upgraded my kernel to 6.6, which doesn't have that issue, but the GHC 9.6.2 I've previously built still segfaults.
(The 9.6.4 build is still ongoing.)
The 9.6.4 build is fully working. I have successfully built ShellCheck and nix-output-monitor with it, and not a single segfault has been seen.
Now I need to try rebuilding 9.6.2 to properly rule out the Linux kernel as the cause and try building 9.2 and 9.4 registerised to see if I can get those to work. This is going to take a lot of time.
So I cross-compiled The host platform's llvm is referenced. Is that even used for an unregisterised build?
Compiling a simple hello world fails because it references the host platform's C compiler
|
Ouch, that is correct.
I can't remember why I didn't fix that when I first wrote the PR, so I'll look into it. |
This is because of the aforementioned issue with I have a (potentially hacky) fix in my |
Propagating the supported platforms of the boot compiler doesn't make much sense when unregistered cross-compilation is possible.
6f791b4
to
3086fe6
Compare
I have no fresh memories of trying older versions, but there are more details in the corresponding GHC issue: https://gitlab.haskell.org/ghc/ghc/-/issues/23957
You're probably right about this, which is why I never noticed that the LLVM in the I tried my earlier example of a working cross-compiled GHC without LLVM in PATH and it still worked. That GHC was built with patched cc-wrapper and binutils-wrapper. |
This is to ensure that Haskell users on platforms that lack official bindists still have a convenient means of getting GHC running natively. In my admittedly somewhat limited testing on RISC-V, GHC 8.10.7 is able to bootstrap native builds for 9.2.8 and 9.4.5. GHC 9.2.8 and 9.4.5 are unable to bootstrap themselves and 9.6.2 when cross-compiled. If you're looking at this commit to see whether you can safely upgrade the compiler used here to remove 8.10, please try cross-compiling 9.0 or later and then booting a native GHC with it.
3086fe6
to
4e0921f
Compare
To test the changes to 9.4.8 and the tentative changes made to 9.6, I've successfully tested the following
This is what I currently care about ;) |
As for the wrappers, looking through the code I can see that the settings file nixpkgs/pkgs/development/compilers/ghc/9.4.8.nix Lines 378 to 389 in 4e0921f
installCC comes from pkgsHostTarget ).
Do I understand correctly that the wrappers used in If this is the case, then this seems a bit orthogonal and I would not block the PR because of this, but I'll let the maintainer decide. Also is the following still true or outdated, at least for the tools patched in the postInstall? nixpkgs/pkgs/development/compilers/ghc/9.4.8.nix Lines 168 to 170 in 08b9151
As mentioned before, at least LLVM seems to be still leaking, so maybe the |
Description of changes
This PR fixes all versions of GHC older than 9.6 to allow cross-compiling GHC itself (build ≠ host = target). In addition to that, it adds an overridable option to force building an unregisterised version.
Changes have also been made to 9.6 and newer, but they are merely tentative because Hadrian cannot currently cross-compile GHC.
Due to bugs in cc-wrapper and binutils-wrapper, cross-compiling GHC still doesn't produce binaries usable without emulation or further fixes, but this will at the very least allow users to port GHC to platforms that Nixpkgs doesn't support natively.
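As a rough illustration of how the description above would be used, the following sketch requests a cross-built native GHC and forces an unregisterised build. It assumes a Nixpkgs checkout with this PR applied is reachable as `<nixpkgs>`, and that the overridable option is the `enableUnregisterised` argument shown in the review diffs; the choice of aarch64 and GHC 9.4.5 is just one of the attributes listed under "Things done".

```nix
# Sketch only: assumes <nixpkgs> points at a checkout that includes this PR.
let
  pkgs = import <nixpkgs> { };
in
# pkgsCross.*.haskell.compiler.* is a GHC that is built on the local platform
# but runs on (and targets) the cross platform, i.e. build ≠ host = target.
pkgs.pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc945.override {
  # Overridable option described in this PR; forces an unregisterised build.
  enableUnregisterised = true;
}
```

As the description notes, the resulting binaries may still need emulation or further fixes to the wrappers before they can run on the host platform.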
Things done
pkgsCross.aarch64-multiplatform.haskell.compiler.integer-simple.ghc{884,8107}
pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc{902,924,925,926,927,928,942,943,944,945}
pkgsCross.riscv64.haskell.compiler.integer-simple.ghc884
pkgsCross.riscv64.haskell.compiler.native-bignum.ghc945
haskell.compiler.native-bignum.ghc945
haskell.compiler.native-bignum.ghc962
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)
)pkgsCross.aarch64-multiplatform.haskell.compiler.integer-simple.ghc{884,8107}
pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc{902,924,925,926,927,928,942,943,944}
pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc945
pkgsCross.riscv64.haskell.compiler.integer-simple.ghc884
pkgsCross.riscv64.haskell.compiler.native-bignum.ghc945
haskell.compiler.native-bignum.ghc945
haskell.compiler.native-bignum.ghc962