davidlattimore opened this issue on Mar 4, 2021 · 0 comments
Labels: I-heavy (Issue: Problems and improvements with respect to binary size of generated code), O-Arm (Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state)
I've observed an unexpected increase in binary size in response to a change in a crate that we use. The change only adds new public methods, which we don't call, so all the changed code is effectively dead code, but it still results in a significant increase in our binary size. My guess is that the presence of this new code causes LLVM to make different inlining decisions, even though the new code isn't actually called anywhere.
This happens on 1.50.0. The increase (for a minimal binary included below) is from 932 bytes to 2164 bytes.
Switching from 1.50 to 1.51 (currently in beta) without the above change causes the same increase from 932 bytes to 2164 bytes.
I was going to mark this as a stable to beta regression, but TBH, I think it's probably a pre-existing issue that just triggers in response to legitimate changes in library code. I expect that whatever changed between 1.50 and 1.51 is similar in nature to the code change above.
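For illustration only, here is a hypothetical Rust sketch (not the actual commit in the dependency, whose code isn't shown here) of the general shape of change described above: a public method is added that nothing in the final binary ever calls, yet the binary grows anyway.

```rust
// Hypothetical sketch of the kind of dependency change described above; the
// names and bodies are invented for illustration.
pub struct Timer {
    ticks: u64,
}

impl Timer {
    /// Existing method that the application does call.
    pub fn elapsed_ticks(&self) -> u64 {
        self.ticks
    }

    /// Newly added public method; nothing in the final binary calls it, so it
    /// is effectively dead code, but its presence appears to change LLVM's
    /// inlining decisions for the code that is used.
    pub fn reset(&mut self) {
        self.ticks = 0;
    }
}
```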
I've tarred up a moderately minimal bit of code that reproduces this: binary-size-increase.tar.gz
To reproduce, run the ./check-size script contained within the tarball. You might need to run rustup target install thumbv6m-none-eabi first.
For me, with current stable 1.50, this shows a change in binary size from 932 bytes to 2164 bytes:
Size prior to commit 77dace37908f281feb9432fc13874475d9dc0765
-rwxr-xr-x 1 dml eng 932 Mar 4 16:39 a.bin
Size after commit 77dace37908f281feb9432fc13874475d9dc0765
-rwxr-xr-x 1 dml eng 2164 Mar 4 16:39 b.bin
If I adjust the script to use 1.51, then I get 2164 bytes for both.
Looking at the disassembly of each binary, it seems that the larger binary includes compiler_builtins::int::specialized_div_rem::u64_div_rem, where the smaller binary doesn't. u64_div_rem is called from __udivmoddi4, which is called from __aeabi_uldivmod. These are also absent from the smaller binary, but present and called from MicroSecond::cycles / Delay::delay in the larger binary.
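For context, the sketch below is an assumed illustration, not the actual MicroSecond::cycles or Delay::delay from the reproduction. It shows the kind of expression that produces these calls: thumbv6m (Cortex-M0, ARMv6-M) has no hardware divide instruction, so any 64-bit division that LLVM can't fold away becomes a libcall to __aeabi_uldivmod, which in turn pulls compiler_builtins' division code into the binary.

```rust
// Illustrative sketch only; the real delay/cycles code in the tarball may
// differ. On thumbv6m-none-eabi a u64 division like this lowers to a libcall
// chain:
//   __aeabi_uldivmod -> __udivmoddi4 -> u64_div_rem
// unless the optimizer removes or constant-folds the division, which is
// presumably what happens (perhaps via inlining) in the smaller binary.
pub fn cycles(duration_us: u64, sysclk_hz: u64) -> u64 {
    duration_us * sysclk_hz / 1_000_000
}
```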
Cargo.toml sets opt-level = "s". Similar results are observed with opt-level = "z".
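For reference, a minimal sketch of the relevant profile section; this is assumed from the description here and the LTO remark below, not copied verbatim from the tarball's Cargo.toml.

```toml
# Assumed release profile sketch, not the exact file from the reproduction.
[profile.release]
opt-level = "s"  # optimize for size; "z" shows similar results
lto = true       # LTO is enabled, as noted below
```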
Given that LTO is enabled, I'd have expected that dead code would be removed before inlining decisions were made, so I'm surprised that a change to code that isn't called would have this effect.
If there's anything we can do to help LLVM make more optimal decisions when optimizing for binary size, that'd be awesome, although I'm sure it's a pretty difficult problem.
JohnTitor added the I-heavy and O-Arm labels on Mar 4, 2021.