-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
trans: Avoid weak linkage for closures when linking with MinGW. #34830
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
I'd personally prefer to avoid a 3.2MB test case if possible, but I can definitely see how it'd be quite nice to have a regression test for this. Could something like this suffice though for generating a ton of closures? macro_rules! a {
($mac:ident) => ($mac!());
($mac:ident 1 $($t:tt)*) => (
a!($mac $($t)*);
a!($mac $($t)*);
)
}
fn main() {
macro_rules! doit {
() => (let _x = |a: u32| a + 4;)
}
a!(doit 1 1 1 1 1 1 1 1 1 1);
} That should generate 1k closures on the stack I believe, although do the closure bodies have to all be different? |
What? Every self-respecting project of a certain size has to have one of those!
Cool! I'll try to adapt it so it doesn't generate all closures in the same function. I've tried that before and it was really bad for compile times. |
dc9427b
to
d5ef24e
Compare
OK, I've updated the test case. Much better now |
🏋 |
Is this symbol count restriction per link-session or per-object/crate/library? |
@michaelwoerister You should update the PR description as well (about the testcase generation). |
@nagisa |
@eddyb Done. |
I’m figuring that if we wanted to go crazy, we could, then, keep a counter per-object which would assign weak_odr linkage for first 2^16 candidates and then make everything else internal, but I realised it is way too much effort for what it provides. |
Yeah, that might prevent the problem at the input level, but then we would probably run into the restriction for the output binary because we are asking |
…=eddyb Run base::internalize_symbols() even for single-codegen-unit crates. The initial linkage-assignment (especially for closures) is a conservative one that makes some symbols more visible than they need to be. While this is not a correctness problem, it does force the LLVM inliner to be more conservative too, which results in poor performance. Once translation is based solely on MIR, it will be easier to also make the initial linkage assignment a better fitting one. Until then `internalize_symbols()` does a good job of preventing most performance regressions. This should solve the regressions reported in #34891 and maybe also those in #34831. As a side-effect, this will also solve most of the problematic cases described in #34793. Not reliably so, however. For that, we still need a solution like the one implement in #34830. cc @rust-lang/compiler
I am OK with this setup. |
@bors r+ |
📌 Commit d5ef24e has been approved by |
⌛ Testing commit d5ef24e with merge 2fded70... |
💔 Test failed - auto-win-msvc-64-opt-rustbuild |
Well that doesn't sound very good. |
@bors: retry |
⌛ Testing commit d5ef24e with merge 3a651d8... |
💔 Test failed - auto-win-msvc-64-opt |
d5ef24e
to
eaea4ac
Compare
@bors r=nikomatsakis This was a legitimate failure of a test case that had been added after the PR was made. Should be fixed. |
📌 Commit eaea4ac has been approved by |
…akis trans: Avoid weak linkage for closures when linking with MinGW. This PR proposes one possible solution to #34793, the problem that prevents servo/servo#12393 from landing. It applies the same strategy, that we already use for monomorphizations, to closures, that is, instead of emitting symbols with `weak_odr` linkage in order to avoid symbol conflicts, we emit them with `internal` linkage, with the side effect that we have to copy code instead of just linking to it, if more than one codegen unit is involved. With this PR, the compiler will only apply this strategy for targets where we would actually run into a problem when using `weak_odr` linkage, in other words nothing will change for platforms except for MinGW. The solution implemented here has one restriction that could be lifted with some more effort, but it does not seem to be worth the trouble since it will go away once we use only MIR-trans: If someone compiles code 1. on MinGW, 2. with more than one codegen unit, 3. *not* using MIR-trans, 4. and runs into a closure inlined from another crate then the compiler will abort and suggest to compile either with just one codegen unit or `-Zorbit`. What's nice about this is that I lays a foundation for also doing the same for generics: using weak linkage where possible and thus enabling some more space optimizations that the linker can do. ~~This PR also contains a test case for compiling a program that contains more than 2^15 closures. It's a huge, generated file with almost 100K LOCs. I did not commit the script for generating the file but could do so. Alternatively, maybe someone wants to come up with a way of doing this with macros.~~ The test file is implemented via macros now (thanks @alexcrichton!) Opinions? Fixes #34793. cc @rust-lang/compiler
This PR proposes one possible solution to #34793, the problem that prevents servo/servo#12393 from landing. It applies the same strategy, that we already use for monomorphizations, to closures, that is, instead of emitting symbols with
weak_odr
linkage in order to avoid symbol conflicts, we emit them withinternal
linkage, with the side effect that we have to copy code instead of just linking to it, if more than one codegen unit is involved.With this PR, the compiler will only apply this strategy for targets where we would actually run into a problem when using
weak_odr
linkage, in other words nothing will change for platforms except for MinGW.The solution implemented here has one restriction that could be lifted with some more effort, but it does not seem to be worth the trouble since it will go away once we use only MIR-trans: If someone compiles code
then the compiler will abort and suggest to compile either with just one codegen unit or
-Zorbit
.What's nice about this is that I lays a foundation for also doing the same for generics: using weak linkage where possible and thus enabling some more space optimizations that the linker can do.
This PR also contains a test case for compiling a program that contains more than 2^15 closures. It's a huge, generated file with almost 100K LOCs. I did not commit the script for generating the file but could do so. Alternatively, maybe someone wants to come up with a way of doing this with macros.The test file is implemented via macros now (thanks @alexcrichton!)
Opinions?
Fixes #34793.
cc @rust-lang/compiler