Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

all: binaries too big and growing #6853

Open
robpike opened this issue Nov 30, 2013 · 162 comments
Open

all: binaries too big and growing #6853

robpike opened this issue Nov 30, 2013 · 162 comments
Labels
binary-size NeedsFix The path to resolution is known, but the work has not been done. umbrella
Milestone

Comments

@robpike
Copy link
Contributor

robpike commented Nov 30, 2013

As an experiment, I build "hello, world" at the release points for go 1.0.
1.1, and 1.2. Here are the binary's sizes:

% ls -l x.1.?
-rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2
% size x.1.?
__TEXT  __DATA  __OBJC  others  dec hex
880640  33682096    0   4112    34566848    20f72c0 x.1.0
1064960 94656   0   75952   1235568 12da70  x.1.1
1429504 147896  0   177440  1754840 1ac6d8  x.1.2
% 

A near-doubling of the binary size in two releases is a bug of a kind. I will hold on to
the files so they can be analyzed more, but am filing this issue to get the topic
registered. We need to develop a better understanding of the problem and how to address
it.

Marking this 1.3 (not maybe) because I consider it a priority.


A few months ago I exchanged mail with Russ about this topic regarding a different, much
larger binary. To avoid him having to redo the analysis, here is what he said at the
time:

====
i sent CL 13722046 to make the nm -S output a bit more useful.
for the toy binary i now get

  4a2280  1898528 D symtab
  26f3a0  1405936 D type.*
  671aa0  1058432 D pclntab
  3c6790   598056 D go.string.*
  4620c0    49600 D gcbss
  7a7c20    45496 B runtime.mheap
  46e280    21936 D gcdata
  7a29e0    21056 b bufferList
  1ed600    16480 T crypto/tls.(*Conn).clientHandshake
  79eb20    16064 b semtable
  1b3d90    14224 T net/http.init

that seems plausible to me. some notes:

symtab is the plan 9 symbol table. it in the binary but never referenced at run time. it
supports things like nm -S only. it needs to move into an unmapped section of the
binary, but it is only costing at most 8k at run time right now due to fragmentation and
it just wasn't worth the effort to try to move. the new linker will make this easier. of
course, moving it in the file doesn't shrink the file.

the thing named pclntab is a reencoding of the original pclntab and the parts of the
plan 9 symbol table that we did need at run time (mostly just a list of functions and
their names and addresses). as you can see, it is much smaller than the old form (the
symbol table dominates).

type.* is the reflect types and go.string.* is the static go string data. the *
indicates that i coalesced many symbols into one, to avoid useless individual names
bloating the symbol table. if we tried we could probably cut the reflect types by 2-4x.
it would mean packing the data a bit more compactly than an ordinary go data structure
would and then using unsafe to get it back out.

gcbss and gcdata are garbage collection bits for the bss and data segments. that's what
atom symbol did, and it's not clear whether it will last (probably not) and whether what
will replace it will be smaller. time will tell. i have a meeting with dmitriy, carl,
and keith next week to figure out what the plan is.

runtime.mheap, bufferList, and semtable are bss.

you're not seeing the gdb dwarf debug information here, because it's not a runtime
symbol.
 
g% otool -l $(which toy) | egrep '^  segname|filesize'
  segname __PAGEZERO
 filesize 0
  segname __TEXT
 filesize 7811072
  segname __DATA
 filesize 126560
  segname __LINKEDIT
 filesize 921772
  segname __DWARF
 filesize 2886943
g% 

there's another 3 MB. you can build with -ldflags -w to get rid of that at least.
if you read the full otool -l output you will find

Load command 6
     cmd LC_SYMTAB
 cmdsize 24
  symoff 10825728
   nsyms 22559
  stroff 11186924
 strsize 560576

looks like another 1 MB or so (560576+11186924-10825728 or 22559*16+560576) for the
mach-o symbol table.

when we do the new linker we can make recording this kind of information in a useful
form a priority.
@robpike
Copy link
Contributor Author

robpike commented Nov 30, 2013

Comment 1:

Note: the binaries were build on amd64 10.7.5 (Lion), with  gcc -version
i686-apple-darwin11-llvm-gcc-4.2
For the record, I couldn't do this experiment on 10.9 with Xcode 5 because the older
releases wouldn't build due to gcc/clang skew.

@robpike
Copy link
Contributor Author

robpike commented Nov 30, 2013

Comment 2:

Labels changed: added priority-later, removed priority-triage.

@gopherbot
Copy link
Contributor

Comment 3 by jlourenco27:

Just for added reference, this are the size on go 1.1.2 vs 1.2 with OS X 10.9 and Xcode
5 (darwin gcc llvm 5.0 x86_64):
$ ls -l *1.*
-rwxr-xr-x  1 j      staff  1525984  2 Dez 21:44 hello_1.1.2
-rwxr-xr-x  1 j      staff  2192672  2 Dez 21:40 hello_1.2
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec hex
1064960 94720   0   76000   1235680 12dae0  hello_1.1.2
1433600 147896  0   177440  1758936 1ad6d8  hello_1.2

@randall77
Copy link
Contributor

Comment 4:

This issue was updated by revision f238049.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/35940047

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 5:

Labels changed: added release-go1.3.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 6:

Labels changed: removed go1.3.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 7:

Labels changed: added repo-main.

@zephyr
Copy link

zephyr commented Dec 7, 2013

Comment 8:

In this context, please reevaluate if the golang binaries can be changed to work with
UPX (ultimate packer for executables).¹²
For a small amount of computing power, upx can reduce the size of a binary down to a
quarter of its original size. You can authenticate this using a existing example
›fixer‹ programm for golang binaries on linux/amd64.³
While this approach doesn't fix the root of the problem – only the symptoms – it
would be nice to have this possibility always on hand.
For technical background of this problem (PT_LOAD[0].p_offset==0), please look at the
UPX bugtracker⁴.
¹ http://upx.sourceforge.net/
² https://en.wikipedia.org/wiki/UPX
³ https://github.com/pwaller/goupxhttp://sourceforge.net/p/upx/bugs/195/

@minux
Copy link
Member

minux commented Dec 7, 2013

Comment 9:

re #8, I don't think it's Go's problem. upx should be made more flexible to handle this.

@leo-liu
Copy link

leo-liu commented Dec 8, 2013

Comment 10:

minux:
We may figure out why the binaries are growing fast first (lager runtime, optimization,
etc.), before we claim that it isn't Go's problem.

@minux
Copy link
Member

minux commented Dec 8, 2013

Comment 11:

re #10, my #9 reply is to #8, which is about an entirely different problem.
i'm not saying that the ever-growing binaries is not our problem, only that i
don't believe that upx not accepting our binaries is our problem.
it's clear that upx isn't able to handle all possible and correct ELF files (i.e.
if the kernel can execute our binaries just fine, it's upx's problem to not be
able to compress them).

@robpike
Copy link
Contributor Author

robpike commented Feb 19, 2014

Comment 12:

More detail. The Plan 9 symbol table is about to be deleted. Here is a reference point,
adding one new entry to the list above:
$ ls -l *1.*
-rwxr-xr-x  1 j      staff  1525984  2 Dez 21:44 hello_1.1.2
-rwxr-xr-x  1 j      staff  2192672  2 Dez 21:40 hello_1.2
-rwxr-xr-x  1 j      staff  2474512 Feb 18 20:27 hello_1.2.x
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec hex
1064960 94720   0   76000   1235680 12dae0  hello_1.1.2
1433600 147896  0   177440  1758936 1ad6d8  hello_1.2
1699840 160984  0   188944  2049768 1f46e8     hello_1.2.x
Text has grown substantially, as has data. At least some of this is due to new
annotations for the garbage collector.

@robpike
Copy link
Contributor Author

robpike commented Feb 19, 2014

Comment 13:

More detail. The Plan 9 symbol table is about to be deleted. Here is a reference point,
adding one new entry to the list above:
$ ls -l *1.*
-rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2
-rwxr-xr-x  1 r  staff  2474512 Feb 18 20:27 hello_1.2.x
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec hex
880640  33682096     0  4112    34566848     20f72c0    x.1.0
1064960 94656   0   75952   1235568 12da70  x.1.1
1429504 147896  0   177440  1754840 1ac6d8  x.1.2
1699840 160984  0   188944  2049768 1f46e8     hello_1.2.x
Text has grown substantially, as has data. At least some of this is due to new
annotations for the garbage collector.

@robpike
Copy link
Contributor Author

robpike commented Feb 19, 2014

Comment 14:

More detail. The Plan 9 symbol table is about to be deleted. Here is a reference point,
adding one new entry to the list above:
$ ls -l *1.*
-rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2
-rwxr-xr-x  1 r  staff  2474512 Feb 18 20:27 hello_1.2.x
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec hex
880640  33682096     0  4112    34566848     20f72c0    x.1.0
1064960 94656   0   75952   1235568 12da70  x.1.1
1429504 147896  0   177440  1754840 1ac6d8  x.1.2
1699840 160984  0   188944  2049768 1f46e8     x.1.2.x
Text has grown substantially, as has data. At least some of this is due to new
annotations for the garbage collector.

@rsc
Copy link
Contributor

rsc commented Feb 19, 2014

Comment 15:

This issue was updated by revision 964f6d3.

Nothing reads the Plan 9 symbol table anymore.
The last holdout was 'go tool nm', but since being rewritten in Go
it uses the standard symbol table for the binary format
(ELF, Mach-O, PE) instead.
Removing the Plan 9 symbol table saves ~15% disk space
on most binaries.
Two supporting changes included in this CL:
debug/gosym: use Go 1.2 pclntab to synthesize func-only
symbol table when there is no Plan 9 symbol table
debug/elf, debug/macho, debug/pe: ignore final EOF from ReadAt
LGTM=r
R=r, bradfitz
CC=golang-codereviews
https://golang.org/cl/65740045

@bradfitz
Copy link
Contributor

Comment 16:

After revision 737767dd81fd, I see a 25% reduction:
Before:
-rwxr-xr-x  1 bradfitz  staff  23556028 Feb 18 20:47 bin/camlistored
After:
-rwxr-xr-x  1 bradfitz  staff  17727420 Feb 18 20:48 bin/camlistored

@robpike
Copy link
Contributor Author

robpike commented Feb 19, 2014

Comment 17:

For my test case before/after deleting the Plan 9 symbol table:
% ls -l ...
-rwxr-xr-x  1 r  staff  2474512 Feb 18 20:27 hello_1.2.x
-rwxr-xr-x  1 r  staff  2150928 Feb 18 22:28 hello_1.2.y
% size ...
__TEXT  __DATA  __OBJC  others  dec hex
1699840 160984  0   188944  2049768 1f46e8     hello_1.2.x
1376256 160984  0   188944  1726184 1a56e8    hello_1.2.x
% 
So deleting the Plan 9 symbol table pretty close to exactly compensates for the GC
information. We're back at Go 1.2 levels, still far too large but it's a start.

@rsc
Copy link
Contributor

rsc commented Feb 19, 2014

Comment 18:

This issue was updated by revision 2541cc8.

Every function now has a gcargs and gclocals symbol
holding associated garbage collection information.
Put them all in the same meta-symbol as the go.func data
and then drop individual entries from symbol table.
Removing gcargs and gclocals reduces the size of a
typical binary by 10%.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/65870044

@rsc
Copy link
Contributor

rsc commented Feb 19, 2014

Comment 19:

This issue was updated by revision ae38b03.

For an ephemeral binary - one created, run, and then deleted -
there is no need to write dwarf debug information, since the
binary will not be used with gdb. In this case, instruct the linker
not to spend time and disk space generating the debug information
by passing the -w flag to the linker.
Omitting dwarf information reduces the size of most binaries by 25%.
We may be more aggressive about this in the future.
LGTM=bradfitz, r
R=r, bradfitz
CC=golang-codereviews
https://golang.org/cl/65890043

@robpike
Copy link
Contributor Author

robpike commented Feb 19, 2014

Comment 20:

After removing gcargs from the symbol table (stepping across CL 65870044)
% ls -l x.1.2.[yz]
-rwxr-xr-x  1 r  staff  2150928 Feb 18 22:28 hello_1.2.y
-rwxr-xr-x  1 r  staff  1932880 Feb 19 08:14 hello_1.2.z
% size x.1.2.[yz] 
__TEXT  __DATA  __OBJC  others  dec hex
1376256 160984  0   188944  1726184 1a56e8    hello_1.2.y
1376256 160984  0   110160  1647400 192328 hello_1.2.z
% 
It's now smaller than at 1.2 but still much bigger than 1.1, let alone 1.0.

@gopherbot
Copy link
Contributor

Comment 21:

I would like to take a look at compressing pclntab. Is it compressible?

@ianlancetaylor
Copy link
Member

Comment 22:

The pclntab data should be reasonably compact, and note that fast access to the data is
important, since it is used for runtime.Callers and friends.  If you can find a
significant reduction in size that would be great, but small tweaks are probably not
desirable at this point.

@gopherbot
Copy link
Contributor

Comment 23 by fuzxxl:

Is there any documentation for what the pclntab contains and for what constraints its
data structure must fullfill?

@ianlancetaylor
Copy link
Member

Comment 24:

Let's not use this issue as a discussion list.  Please ask questions on golang-dev. 
Thanks.

@gopherbot
Copy link
Contributor

Comment 25 by allard.guy.m:

From the peanut gallery, AFAICT this breaks pprof interactive command 'list'.  At tip I
get, e.g.:
(pprof) list runner
Total: 6424 samples
objdump: syminit: Success
no filename found in main.runner<400c40>
Which works as expected with 1.2.1.

@rsc
Copy link
Contributor

rsc commented Apr 3, 2014

Comment 26:

Let's not use this issue as a discussion list.  Please ask questions on golang-dev. 
Thanks.
pprof not working is issue #7452.

Labels changed: added restrict-addissuecomment-commit.

@rsc
Copy link
Contributor

rsc commented Apr 3, 2014

Comment 27:

This is as fixed as it is going to be for Go 1.3.
Right now at tip + CL 80370045 on darwin/amd64, compiling this program:
package main
import "fmt"
func main() {
    fmt.Println("hello, world")
}
I get 1830352 bytes for the binary. Assuming this is the same case for which Rob's
numbers are reported, by this metric Go 1.3 will roll back more than half the size
increase caused by Go 1.2 (relative to Go 1.1). Will leave further improvement for Go
1.4.

Labels changed: added release-go1.4, removed release-go1.3.

@rsc
Copy link
Contributor

rsc commented Apr 3, 2014

Comment 28:

This issue was updated by revision a26c01a.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80370045

@rsc
Copy link
Contributor

rsc commented Sep 15, 2014

Comment 29:

Labels changed: added release-go1.5, removed release-go1.4.

@clausecker
Copy link

I thought at first that restricting dead-code elimination to only those packages that are not imported by other packages that invoke reflect might work, but of course reflect can also be used through wrapper libraries or even function pointers getting method names from who knows where. It's really not easy.

@josharian
Copy link
Contributor

@egonelbre

I'm not following why that approach would be questionable -- seems completely reasonable to me.

Seems fragile. Do all distros use the same string? Gets the wrong answer with a class of (absurdly named) executables.

I guess the current approach is also arguably fragile in that you could lose an init vs delete race.

Feel free to send a CL if you want. :) Note that you have to manually inline strings.TrimSuffix.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/311790 mentions this issue: os: reduce binary size

@egonelbre
Copy link
Contributor

@josharian ah, gotcha... I didn't read the comment near executablePath. Yes, then it makes sense why it's fragile.

I'm not sure whether it'll get accepted or not, but I sent the CL anyways.

gopherbot pushed a commit that referenced this issue Apr 22, 2021
Currently Readlink gets linked into the binary even when Executable is
not needed.

This reduces a simple "os.Stdout.Write([]byte("hello"))" by ~10KiB.

Previously the executable path was read during init time, because
deleting the executable would make "Readlink" return "(deleted)" suffix.
There's probably a slight chance that the init time reading would return
it anyways.

Updates #6853

Change-Id: Ic76190c5b64d9320ceb489cd6a553108614653d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/311790
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Tobias Klauser <tobias.klauser@gmail.com>
@SeanTolstoyevski
Copy link

I've been following this issue for a long time.
But I don't know very well about compiler theory or assembly. I want to say some of my ideas and ask some questions.

Unlike the general community, I use Golang for systems programming (GUI, games).
I don't want the program deployed on these systems to retain too much information. For example I don't need traceback. This makes the code I distribute even more susceptible to reverse engineering. Ok, I'm not really obsessed with this, but I don't want everything to be there when the program crashes.
It seems that Golang holds a lot of information for traceback, even though no console screen has created.
When I examine the executables with various executable analysis tools I can clearly see the names of all the variables, some panic strings etc. No deassembler is needed for this. They really are there.

Maybe a compiler parameter could be added to remove traceback information.
This is not required for Golang products produced as a web service. But not everyone writes Golang for these jobs. We install programs that contain sensitive data on people's computers.

@josharian
Copy link
Contributor

@SeanTolstoyevski you may find this interesting: https://commaok.xyz/post/no-line-numbers/

@ianlancetaylor
Copy link
Member

@SeanTolstoyevski We're reluctant to provide a mechanism to remove traceback information because it will break runtime.Callers, which will in return break key libraries like logging libraries. There's no good way to remove only the undesirable traceback information, and the set of people who really truly want no traceback information at all is small.

@clausecker
Copy link

@SeanTolstoyevski We're reluctant to provide a mechanism to remove traceback information because it will break runtime.Callers, which will in return break key libraries like logging libraries. There's no good way to remove only the undesirable traceback information, and the set of people who really truly want no traceback information at all is small.

Wouldn't it be possible to just remove symbol names, file, and line numbers, keeping the traceback data otherwise intact? This way consumers of runtime.Callers would still somewhat work while all sensitive data is removed.

@ianlancetaylor
Copy link
Member

I'm sorry, I don't understand the suggestion. The traceback information is exactly symbol names, file names, and line numbers (see https://golang.org/pkg/runtime/#Frame). How can we remove those and leave the traceback data intact?

@clausecker
Copy link

clausecker commented Jun 4, 2021

The Frame structure already has provisions for an unknown function name (i.e. set the function name to the empty string). Similarly, File and Line could be set to nil and 0 if traceback information is stripped. What would break if this is done? If discarding function names is already too much, what about just discarding file names and line numbers?

Perhaps there was some sort of miscommunication. I assumed what you considered impossible is completely removing traceback information in the sense that runtime.Callers would be unable to retrieve an array of function pointers.

@SeanTolstoyevski
Copy link

Thanks everyone for the replies.

Data such as traceback, import path information take up enough size in executable files.
Since the title of our topic is large executable files, I thought that people can remove this information from the executable file according to their wishes.
In my opinion this should help for file size. And the scenarios I mentioned.

@ianlancetaylor In fact, there is no need to remove it completely. For example, for module paths consider something simple like .../module/modulefile.go.
This also prevents leaking the Golang project structure on one's computer.
Currently, there is a traceback structure like this. The whole path is there, and when you think about it for all go files it just feels natural to take up enough size:
C/base_path/sub_path/gopath/x/x/x/modulefile.go

Sorry for the insufficient english. I tried to explain this simply. I may have chosen the wrong words.

@seankhliao
Copy link
Member

That is available as the -trimpath build flag

@mvdan
Copy link
Member

mvdan commented Jun 4, 2021

I don't want the program deployed on these systems to retain too much information. For example I don't need traceback. This makes the code I distribute even more susceptible to reverse engineering.

You might find https://github.com/burrowers/garble useful, which attempts to remove some source code information when building. It's particularly aggressive with the -tiny flag. I personally don't think that kind of "extra stripping" needs to be supported by Go's toolchain directly, given how it's possible to do it externally, for the most part.

@ianlancetaylor
Copy link
Member

If we dropped the file/line/function information, then logging and profiling would break.

@clausecker
Copy link

If we dropped the file/line/function information, then logging and profiling would break.

I think breaking profiling is acceptable if the user desires smaller binaries. As for logging, I'm not sure what a good trade off would be.

@GooSatoru
Copy link

GooSatoru commented Dec 10, 2021

I use gomobile for my Andr and iOS app, the android architecture with arm, arm64, 386, amd64,if i support so many architecture, it will generate each architecture with a libgojni.so,the aar file size increase fast, if the size can't be shrink, i think no one will use gomobile.

dteh pushed a commit to dteh/fhttp that referenced this issue Jun 22, 2022
This shrinks a binary that just does an http.ListenAndServe by about
90kb due to using less code for the static array.

 syms
 delta   name                                                old-size  new-size  pct-difference
-54206   vendor/golang_org/x/net/http2/hpack.newStaticTable  55233     1027      -98.14%
-4744    runtime.pclntab                                     1041055   1036311   -0.46%
-204     runtime.findfunctab                                 10675     10471     -1.91%
8        runtime.typelink                                    9852      9860      0.08%
41       runtime.gcbss                                       869       910       4.72%
11711    vendor/golang_org/x/net/http2/hpack.init            572       12283     2047.38%

 sections
 delta   name             old-size  new-size  pct-difference
-41888   .text            2185840   2143952   -1.92%
-37644   .rodata          842131    804487    -4.47%
-4744    .gopclntab       1041055   1036311   -0.46%
-3343    .debug_info      981995    978652    -0.34%
-2931    .debug_line      291295    288364    -1.01%
8        .typelink        9852      9860      0.08%
59       .debug_pubnames  81986     82045     0.07%
96       .symtab          186312    186408    0.05%
113      .debug_pubtypes  137500    137613    0.08%
128      .debug_frame     219140    219268    0.06%
220      .strtab          217109    217329    0.10%
2464     .bss             127752    130216    1.93%

Updates golang/go#6853

Change-Id: I3383e63300585539507b75faac1072264d8f37e7
Reviewed-on: https://go-review.googlesource.com/43090
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
dteh pushed a commit to dteh/fhttp that referenced this issue Jun 22, 2022
Inlining isn't performed on generated init functions.  Removing the pair
function and initializing the structs directly saves another 16k on a
simple http.ListenAndServe() binary.

delta   name                                      old      new
-58     runtime.findfunctab                       10471    10413         -0.55%
-41     runtime.gcbss                             910      869           -4.51%
41      runtime.gcdata                            612      653            6.70%
-408    runtime.pclntab                           1036311  1035903       -0.04%
-11711  vendor/golang_org/x/net/http2/hpack.init  12283    572          -95.34%

Updates golang/go#6853

Change-Id: Ibccc796fe7403674cf4b4561acf9551d76ff11e8
Reviewed-on: https://go-review.googlesource.com/43190
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/462300 mentions this issue: cmd/internal/obj/x86: use push/pop instead of mov to store/load FP

gopherbot pushed a commit that referenced this issue Jan 31, 2023
This CL changes how the x86 compiler stores and loads the frame pointer
on each function prologue and epilogue, with the goal to reduce the
final binary size without affecting performance.

The compiler is currently using MOV instructions to load and store BP,
which can take from 5 to 8 bytes each.

This CL changes this approach so it emits PUSH/POP instructions instead,
which always take only 1 byte each (when operating with BP). It can also
avoid using the SUBQ/ADDQ to grow the stack for functions that have
frame pointer but does not have local variables.

On Windows, this CL reduces the go toolchain size from 15,697,920 bytes
to 15,584,768 bytes, a reduction of 0.7%.

Example of epilog and prologue for a function with 0x10 bytes of
local variables:

Before

===
 SUBQ    $0x18, SP
 MOVQ    BP, 0x10(SP)
 LEAQ    0x10(SP), BP

 ... function body ...

 MOVQ    0x10(SP), BP
 ADDQ    $0x18, SP
 RET
===

After

===
  PUSHQ   BP
  LEAQ    0(SP), BP
  SUBQ    $0x10, SP

  ... function body ...

  MOVQ    ADDQ $0x10, SP
  POPQ    BP
  RET
===

Updates #6853

Change-Id: Ice9e14bbf8dff083c5f69feb97e9a764c3ca7785
Reviewed-on: https://go-review.googlesource.com/c/go/+/462300
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/463845 mentions this issue: cmd/internal/obj/x86: use mov instead of lea to load the frame pointer

gopherbot pushed a commit that referenced this issue Jan 31, 2023
This CL instructs the Go x86 compiler to load the frame pointer address
using a MOV instead of a LEA instruction, being MOV 1 byte shorter:

Before
  55            PUSHQ   BP
  48 8d 2c 24   LEAQ    0(SP), BP

After
  55            PUSHQ   BP
  48 89 e5      MOVQ    SP, BP

This reduces the size of the Go toolchain ~0.06%.

Updates #6853

Change-Id: I5557cf34c47e871d264ba0deda9b78338681a12c
Reviewed-on: https://go-review.googlesource.com/c/go/+/463845
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
This CL changes how the x86 compiler stores and loads the frame pointer
on each function prologue and epilogue, with the goal to reduce the
final binary size without affecting performance.

The compiler is currently using MOV instructions to load and store BP,
which can take from 5 to 8 bytes each.

This CL changes this approach so it emits PUSH/POP instructions instead,
which always take only 1 byte each (when operating with BP). It can also
avoid using the SUBQ/ADDQ to grow the stack for functions that have
frame pointer but does not have local variables.

On Windows, this CL reduces the go toolchain size from 15,697,920 bytes
to 15,584,768 bytes, a reduction of 0.7%.

Example of epilog and prologue for a function with 0x10 bytes of
local variables:

Before

===
 SUBQ    $0x18, SP
 MOVQ    BP, 0x10(SP)
 LEAQ    0x10(SP), BP

 ... function body ...

 MOVQ    0x10(SP), BP
 ADDQ    $0x18, SP
 RET
===

After

===
  PUSHQ   BP
  LEAQ    0(SP), BP
  SUBQ    $0x10, SP

  ... function body ...

  MOVQ    ADDQ $0x10, SP
  POPQ    BP
  RET
===

Updates golang#6853

Change-Id: Ice9e14bbf8dff083c5f69feb97e9a764c3ca7785
Reviewed-on: https://go-review.googlesource.com/c/go/+/462300
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
This CL instructs the Go x86 compiler to load the frame pointer address
using a MOV instead of a LEA instruction, being MOV 1 byte shorter:

Before
  55            PUSHQ   BP
  48 8d 2c 24   LEAQ    0(SP), BP

After
  55            PUSHQ   BP
  48 89 e5      MOVQ    SP, BP

This reduces the size of the Go toolchain ~0.06%.

Updates golang#6853

Change-Id: I5557cf34c47e871d264ba0deda9b78338681a12c
Reviewed-on: https://go-review.googlesource.com/c/go/+/463845
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binary-size NeedsFix The path to resolution is known, but the work has not been done. umbrella
Projects
None yet
Development

No branches or pull requests