-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make #[used]
work when linking with ld64
#133832
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
symbols.o
trick work when linking with ld64
#[used]
work when linking with ld64
c6251a8
to
0653e4c
Compare
let (data, align, addend, r_type): (&[u8], _, _, _) = match &*target.arch { | ||
"arm" => (&[0xff, 0xf7, 0xfe, 0xbf], 2, 0, object::macho::ARM_THUMB_RELOC_BR22), | ||
"aarch64" => (&[0, 0, 0, 0x14], 4, 0, object::macho::ARM64_RELOC_BRANCH26), | ||
"x86_64" => ( | ||
&[0x55, 0x48, 0x89, 0xe5, 0x5d, 0xe9, 0, 0, 0, 0], | ||
16, | ||
-4, | ||
object::macho::X86_64_RELOC_BRANCH, | ||
), | ||
"x86" => (&[0x55, 0x89, 0xe5, 0x5d, 0xe9, 0xf7, 0xff, 0xff, 0xff], 16, -4, 0), | ||
arch => unimplemented!("unsupported Apple architecture {arch:?}"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite unhappy with this, it feels very brittle to emit a manual relocation like this, but I think it's the only way to do it (without pulling in a bunch of LLVM codegen internals).
I'm also unsure of the correctness, there might be more target options and/or flags that we should be checking here? Or maybe we should emit different kinds of relocations based on the kind of symbol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... emitting manual relocations like this feels super fragile and hard to maintain. At the very least this would need comments and ideally references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think additional comments are needed here, except to explain what this data represents, such as [0, 0, 0, 0x14]
represents the b
instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a bit concern about here, the symbol of #[used]
is a global variable, the C code should be void foo(void);
. I'm not sure what can happen here. I prefer an address offset instruction. This is more safe to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to add a local symbol to help ld64 strip these dead code.
$ objdump -d target/release/divan-example | head -n 10
target/release/divan-example: file format mach-o arm64
Disassembly of section __TEXT,__text:
000000010000273c <__text>:
10000273c: 1400000a b 0x100002764 <_main>
100002740: 14016d1e b 0x10005dbb8 <__ZN3std9panicking11EMPTY_PANIC17hd7edc6c0e88a95c9E>
100002744: 1400eca4 b 0x10003d9d4 <_rust_eh_personality>
100002748: 14016632 b 0x10005c010 <__ZN13divan_example4PUSH17h56090c2dad718183E>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestions here sounds correct to me, I've also later been thinking that an address offset instruction would be safer than a jump (since address offsets can be done safely on both function pointers and data pointers).
You may want to add a local symbol
I'm not sure I understand what you mean by that? You want a local symbol that appears as the "entry" point for the section? Does that help the linker see that the code is unused?
offset, | ||
addend, | ||
symbol, | ||
flags: object::write::RelocationFlags::MachO { r_type, r_pcrel: true, r_length: 2 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we could outsource part of this to object
by using object::write::RelocationFlags::Generic
, but we'd still have to write the machine code ourselves.
This comment has been minimized.
This comment has been minimized.
0653e4c
to
78a1c2d
Compare
@estebank I'm gonna mark this as ready for an initial review, to make sure that you agree that the approach here is the right one (vs. the alternatives I noted in the PR description / another alternative that I haven't considered). If so, then I'll complete the remaining TODO items. |
This PR modifies cc @jieyouxu |
I'm sorry. I don't think I'm the best person to review this change. r? compiler |
I'm going to ask in T-compiler since this is likely to bounce based on random rerolls. I'll give it a few days, if we still can't find a suitable reviewer by then, I'll compiler-nominate this PR. Please ping me next Monday if we still haven't found a suitable review (in case this gets lost in my review queue), I'll nominate it for next week's triage meeting. |
Edit: I need more time to check it. I'm not the best person here, but fewer people are using macOS here. I will try to review it tomorrow. |
IIUC, I can reproduce this with C: // foo.c
__attribute__((constructor))
void foo() {}
// main.c
int main() {
return 0;
} When I build the I'm concerned about the current implementation. It seems we're building our logic around ld64's internal implementation details, and ld64 isn't truly open source. (I haven't looked into ld64's implementation details yet. I'm not sure if this approach is fragile or will be difficult to maintain in the future.) IIUC, the key here is to have the linker load object files as intended, just like when we directly use If we don't want to do this, how about adding the right value of |
// and hence this is all very architecture-specific. | ||
// | ||
// See the following for a few details on the design of ld64: | ||
// https://github.com/apple-oss-distributions/ld64/blob/ld64-951.9/doc/design/linker.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link doesn't seem to mention the details of how ld64 resolves undefined symbols, especially regarding relocations. Should we add links to the relevant code locations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was the closest I could find, but actual links to the code would be a good idea. I'll try to find the relevant parts when I get the time.
Note though that I think the new linker, -ld_prime
, is not open source? So this will only be a weak reference in any case.
That's correct.
I agree that it is fragile, but I don't think it is as bad as it looks; consider that linkers are mostly backwards compatible, and that it works now (whereas before it just didn't work at all, in no versions of ld64). I honestly don't think Apple is gonna change how this works, and if they do, we'll have the Xcode betas to fix it. The only thing I can conceivably think of that might break is:
I guess if we wanted to make it even more robust, we'd do something like: // symbols.rs
extern crate crate1;
extern crate crate2;
extern crate crate3;
fn _used_symbols() {
let _static1 = crate1::STATIC1;
let _static2 = crate2::STATIC2;
let _fn3 = crate3::fn3;
}
// rustc symbols.rs --emit=obj That is, emit a label that refers to the block of code that touches all symbols, such that the linker cannot assume the section to be unused. The roughly equivalent could be done with file.add_symbol(write::Symbol {
name: "_used_symbols".into(),
value: 0,
size: 0,
kind: SymbolKind::Text,
scope: SymbolScope::Dynamic,
weak: false,
section: write::SymbolSection::Section(section_id),
flags: SymbolFlags::None,
}); Was that clear? I can try to go in more detail here if you want? Or try to conjure up some contrived assembly code, and consider how ld64 would have to link that now and in the future?
I can see two ways to do that:
The former is bad for link-time performance (the linker can skip a lot of work for archives that if can't for object files), and the latter is either also bad for perf, or makes the integration between
Hmm, not sure I understand? Note that we'd still want e.g. |
The linker would have still pulled in the object file by that time. And |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we'd still want e.g.
#[used] #[link_section = "..."] static FOO: i32 = 2;
to be seen by the linker, and here there's no "inner" symbol to refer to?
Ah, yeah! This also reminder me of something. Consider the following C code:
__attribute__((constructor)) static void push() {}
or
static void push() {}
#if defined(__linux__)
__attribute__((section(".init_array")))
#elif defined(__APPLE__)
__attribute__((section("__DATA,__mod_init_func,mod_init_funcs")))
#endif
static void (*const PUSH)(void) = push;
There are no global symbols here. I think rustc should not modify local symbols into global symbols through #[used]
- in this case, it's the same situation as the C code above, where an object file has no global symbols.
IMO, loading object files through a "symbols.o" doesn't seem like a good idea. Looking at the longer term, I don't think we should continue to go this way.
Looking at this PR itself, I don't think we need to focus on ld64's internal implementation, since we're trying to simulate a standard object file. Based on my newly added comments, I think this approach is still quite complicated, and we need more workarounds to resolve it. What do you think?
let (data, align, addend, r_type): (&[u8], _, _, _) = match &*target.arch { | ||
"arm" => (&[0xff, 0xf7, 0xfe, 0xbf], 2, 0, object::macho::ARM_THUMB_RELOC_BR22), | ||
"aarch64" => (&[0, 0, 0, 0x14], 4, 0, object::macho::ARM64_RELOC_BRANCH26), | ||
"x86_64" => ( | ||
&[0x55, 0x48, 0x89, 0xe5, 0x5d, 0xe9, 0, 0, 0, 0], | ||
16, | ||
-4, | ||
object::macho::X86_64_RELOC_BRANCH, | ||
), | ||
"x86" => (&[0x55, 0x89, 0xe5, 0x5d, 0xe9, 0xf7, 0xff, 0xff, 0xff], 16, -4, 0), | ||
arch => unimplemented!("unsupported Apple architecture {arch:?}"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think additional comments are needed here, except to explain what this data represents, such as [0, 0, 0, 0x14]
represents the b
instruction.
let (data, align, addend, r_type): (&[u8], _, _, _) = match &*target.arch { | ||
"arm" => (&[0xff, 0xf7, 0xfe, 0xbf], 2, 0, object::macho::ARM_THUMB_RELOC_BR22), | ||
"aarch64" => (&[0, 0, 0, 0x14], 4, 0, object::macho::ARM64_RELOC_BRANCH26), | ||
"x86_64" => ( | ||
&[0x55, 0x48, 0x89, 0xe5, 0x5d, 0xe9, 0, 0, 0, 0], | ||
16, | ||
-4, | ||
object::macho::X86_64_RELOC_BRANCH, | ||
), | ||
"x86" => (&[0x55, 0x89, 0xe5, 0x5d, 0xe9, 0xf7, 0xff, 0xff, 0xff], 16, -4, 0), | ||
arch => unimplemented!("unsupported Apple architecture {arch:?}"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a bit concern about here, the symbol of #[used]
is a global variable, the C code should be void foo(void);
. I'm not sure what can happen here. I prefer an address offset instruction. This is more safe to me.
let (data, align, addend, r_type): (&[u8], _, _, _) = match &*target.arch { | ||
"arm" => (&[0xff, 0xf7, 0xfe, 0xbf], 2, 0, object::macho::ARM_THUMB_RELOC_BR22), | ||
"aarch64" => (&[0, 0, 0, 0x14], 4, 0, object::macho::ARM64_RELOC_BRANCH26), | ||
"x86_64" => ( | ||
&[0x55, 0x48, 0x89, 0xe5, 0x5d, 0xe9, 0, 0, 0, 0], | ||
16, | ||
-4, | ||
object::macho::X86_64_RELOC_BRANCH, | ||
), | ||
"x86" => (&[0x55, 0x89, 0xe5, 0x5d, 0xe9, 0xf7, 0xff, 0xff, 0xff], 16, -4, 0), | ||
arch => unimplemented!("unsupported Apple architecture {arch:?}"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to add a local symbol to help ld64 strip these dead code.
$ objdump -d target/release/divan-example | head -n 10
target/release/divan-example: file format mach-o arm64
Disassembly of section __TEXT,__text:
000000010000273c <__text>:
10000273c: 1400000a b 0x100002764 <_main>
100002740: 14016d1e b 0x10005dbb8 <__ZN3std9panicking11EMPTY_PANIC17hd7edc6c0e88a95c9E>
100002744: 1400eca4 b 0x10003d9d4 <_rust_eh_personality>
100002748: 14016632 b 0x10005c010 <__ZN13divan_example4PUSH17h56090c2dad718183E>
Hmm, I think I see what you mean, but is it really an issue? My understanding is vague here, but I think global vs. local symbols are linker-only concepts, are they not? Perhaps things differ when it comes to dynamic libraries, but I don't the symbol is marked as "exported" (?) , so the linker should not treat things any differently there either?
I'm not sure I understand, how do you propose we solve this instead, then? In a way, |
(technically |
As @bjorn3 described,
My current thinking is to treat this PR as a working solution for now, while loading direct object file is a long-term improvement goal. Besides the test case, could you address my two concerns above?
|
Not even kept by the linker. Just kept long enough for the linker to be able to see it at all. The linker is free to discard the symbol if it wants to. It just has to see it such that it can make the decision whether to discard it or not. This makes it pretty much useless for any purpose other than global constructors. Even the section merging for distributed slices is broken with lld because of this and requires |
Ah, cool! Thanks for the review and discussion so far, btw!
Certainly (though it might take a little while for me to get back into it)!
I get your point that it might be pretty much useless on its own, but you can very much make it useful by adding |
Adding such a flag to the symbol is what the unstable |
To make
#[used]
work in static libraries, we use thesymbols.o
trick introduced in #95604.However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).
So to make the
symbols.o
file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.Fixes #133491, see that for investigation.
Another option would be to pass
-u _foo
to the final linker invocation. This has the problem that-u
causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via@
, e.g.@undefined_symbols.txt
, similar to #52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).Various other options that are probably all undesirable as they affect link time performance:
-all_load
to the linker.-ObjC
to the linker (the Objective-C support in the linker has different code paths that load more of the binary), and instrument the binaries that contain#[used]
symbols.-force_load
to libraries that contain#[used]
symbols.Failed attempt: Embed
-u _foo
in the object file withLC_LINKER_OPTION
, akin to #121293. Doesn't work, both becauseld64
doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and becauseld64
only support the-l
,-needed-l
,-framework
and-needed_framework
flags in there.TODO:
@rustbot label O-apple