misc/cgo/testshared: shared libs tests fail on arm64 with segmentation fault #28334
The issue is also reproducible with go1.9.4, except that I had to:
The error log was almost the same, if not exactly the same.
Which linker (the C linker, not Go's cmd/link) are you using: the bfd linker, gold, or lld? Which version? I vaguely remember that some old versions of the gold linker on ARM64 don't work well with this.
Glibc in Oracle Linux seems to be based on 2.17. However, there seem to be a number of backports to it from the latest upstream glibc. The linker seems to be "Gold." What I suspect is that this bug might be reproducible with new glibcs, not old ones. Issue 24873 says that it has problems with glibc 2.27. The golang:1.11.1 docker container has no problem, but its glibc is as old as 2.24. I am pretty sure that this new Oracle Linux package has patches backported even from the latest glibc release.
I have looked into this bug a bit more. Firstly, I modified the following file, and ran ./run.bash -run testshared:
The log is here, as I cannot find a menu to attach a file:
All the broken test cases are executables built with the "-linkshared" option. Sometimes they also have "-buildmode=shared," and sometimes no "-buildmode" at all. Interestingly, when "-buildmode=pie" is present, I did not see a segmentation fault. Two test cases appear to use that option. More interestingly, the test case called "trivial" is built twice: once with -buildmode=pie and once without it. It gets a segmentation fault only when -buildmode=pie is not given.
After binary searching the glibc of the distro, I could see that the backport of this patch caused the issue:
backport of Sep 15, 2017 commit 6cd380dd366d728da9f579eeb9f7f4c47f48e474
eXecute-Only Memory (XOM) is a protection mechanism against some ROP
I add MOVL macro with movz/movk instructions like movl pseudo-instruction
https://patchwork.ozlabs.org/patch/810876/
The glibc patch above was applied to the distro's glibc and triggered the issue. In start.S, the "MOVL" macro defined in sysdeps/aarch64/sysdep.h is used. The macros are like this:
In start.S, the _start function is supposed to set x0, whose value __libc_start_main uses to branch to a function. With the patch, i.e. with the MOVL macro, the value is set to 0. Without the patch, the value is set to a nonzero address, and the "testshared" tests pass. I also tried building an executable with gcc without -pie. I think the same MOVL macro was used, but this time the address was assigned correctly. I am not yet sure which component is buggy: the runtime linker, the assembler, or "go link."
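To make the failure mode above concrete, here is a minimal Go sketch of how a movz/movk sequence assembles a 64-bit address from four 16-bit pieces that the linker fills in through relocations. It is a model only, not glibc's macro: the helper names (`movz`, `movk`, `chunk`, `loadAddress`) and the sample address are hypothetical. If every relocated piece resolves to 0, the register ends up 0, which matches the observed x0.

```go
package main

import "fmt"

// movz models "movz reg, #imm16, lsl #shift": the register becomes the
// 16-bit immediate shifted into place, with every other bit cleared.
func movz(imm16 uint16, shift uint) uint64 {
	return uint64(imm16) << shift
}

// movk models "movk reg, #imm16, lsl #shift": one 16-bit field is
// overwritten and the remaining bits of the register are kept.
func movk(reg uint64, imm16 uint16, shift uint) uint64 {
	mask := uint64(0xffff) << shift
	return (reg &^ mask) | uint64(imm16)<<shift
}

// chunk returns 16-bit group g (0 = lowest) of addr. This stands in for
// the immediate a linker patches in when it resolves one relocation.
func chunk(addr uint64, g uint) uint16 {
	return uint16(addr >> (16 * g))
}

// loadAddress rebuilds a 64-bit value from the four groups of the address
// the linker resolved for a symbol: one movz followed by three movk.
func loadAddress(resolved uint64) uint64 {
	reg := movz(chunk(resolved, 3), 48)
	reg = movk(reg, chunk(resolved, 2), 32)
	reg = movk(reg, chunk(resolved, 1), 16)
	reg = movk(reg, chunk(resolved, 0), 0)
	return reg
}

func main() {
	const mainAddr = 0x4005d4 // hypothetical address of main, for illustration

	// Relocations resolved to the real address: the register holds it.
	fmt.Printf("resolved correctly: %#x\n", loadAddress(mainAddr))

	// Relocations resolved to 0: the register is 0, so a later branch
	// through it would jump to address 0.
	fmt.Printf("resolved to zero:   %#x\n", loadAddress(0))
}
```

The point of the sketch is that the instruction sequence itself has no fallback: whatever value the linker writes into the four immediates is exactly what ends up in the register.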
For the case named "trivial" that fails with a segmentation fault, I did the following after increasing verbosity by touching misc/cgo/testshared/src/*.go:
The last link command to build the executable was:
There, I also passed -v to "link" along with -extldflags "-fuse-ld=bfd". I think the external linker took the option. This time, I did not see the segmentation fault. Thus, I guess that when the gold linker resolves #:abs_g0_nc:main, it somehow fails to use the right value.
Here is the objdump --all /usr/lib64/crt1.o:
SYMBOL TABLE:
RELOCATION RECORDS FOR [.text]:
I am not a linker expert. However, I do not see much difference between main and __libc_csu_fini. I guess this might be caused by the combination of the gold linker and the Go tools.
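The linker comparison described above can be sketched as a small Go helper that prints a `go build` command line for each C linker. This is an illustration under assumptions, not the exact command from the test run: the function name `buildArgs` and the package path `trivial` are hypothetical, and the program only prints the commands rather than executing them.

```go
package main

import (
	"fmt"
	"strings"
)

// buildArgs returns arguments for "go build" that link pkg against shared
// Go libraries (-linkshared) and ask gcc, used as the external linker
// driver, to select the named C linker through -fuse-ld.
func buildArgs(pkg, linker string) []string {
	ldflags := "-extld=gcc -extldflags=-fuse-ld=" + linker
	return []string{"build", "-linkshared", "-ldflags", ldflags, pkg}
}

func main() {
	// "trivial" is a placeholder package path for illustration.
	for _, linker := range []string{"gold", "bfd"} {
		args := append([]string{"go"}, buildArgs("trivial", linker)...)
		fmt.Println(strings.Join(args, " "))
	}
}
```

Note that the printed line is not shell-quoted; when running it by hand, the value after -ldflags needs quotes because it contains a space. Building the same package once per linker and running each binary is one way to check whether the crash follows the linker choice.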
Finally, I think I know what is going on here. In my opinion, this issue should be fixed in the Go tools. Here are the summary and the justification.
In short, the "main" function is NOT in go.o but in the .so file built by the Go tools. Thus, it seems that the external linker should link them against Scrt1.o rather than crt1.o. However, the Go tool, apparently "go link," invokes gold to link them against crt1.o. This had not caused a failure before because crt1.o and Scrt1.o did not differ in a way that matters here. That changed with the glibc upgrade. Upstream glibc as of today still carries the change, so I assume that glibc in multiple distros is moving in this direction, which would cause the reported issue (#28334). In my opinion, the Go link tool needs some change to address it.
Here are the details of the problem and how I reached the conclusion. On the Oracle Linux docker container for aarch64, which appears to be OL 7.5 according to the tag, "misc/cgo/testshared" failed as described in the very first comment of this issue report. As an example, take misc/cgo/testshared/src/trivial. The test has the Go tools build libruntime,sync-atomic.so and the executable named "trivial." I believe the following two commands are what ran at the top level.
The very last "go link" command to create the executable, "trivial," seems like this:
I added -extldflags="-v -Wl,-v" after -extld=gcc so that I could see the linker command actually used. It looks as follows:
Please note that crt1.o is used rather than Scrt1.o. I hacked "go link" so that it does not delete go.o, and I found libruntime,sync-atomic.so by passing the -work option. It seems that go.o does NOT have "main." Instead, the .so has it:
It appears to me that the question is what the linker should do when it is asked to link a PIC libmain.so that contains the main function with a non-PIC foo.o that contains some utility functions. As far as testshared is concerned, that appears to be what the Go tools do.
Then, I built libmain.so and the non-PIC foo.o as follows:
Following that, the executable is built like this:
What is the correct behavior of a linker? If main were in the non-PIC .o and linked against the PIC .so file, both linkers would use crt1.o and have no problem:
As mentioned elsewhere, it looks like you have a pure C test case that should be reported at https://sourceware.org/bugzilla.
For the record this was reported at https://sourceware.org/bugzilla/show_bug.cgi?id=23870 . Thanks!
Likely nothing to do in the Go tools, so closing this issue.
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
1.9.4, 1.10.3, 1.11.1, and perhaps more
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
GOARCH="arm64"
GOBIN=""
GOEXE=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/aion1223/go"
GORACE=""
GOROOT="/usr/lib/golang"
GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_arm64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build048848957=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
What did you do?
git clone https://github.com/golang/go go
cd go/src
git fetch --all
git checkout -b mygolang $(git rev-list -n 1 go1.11.1)
./all.bash
I have tried go1.11.1 and go1.10.3 in the same way.
What did you expect to see?
The build process terminates successfully after passing all the tests
What did you see instead?
../misc/cgo/testshared
--- FAIL: TestTrivialExecutable (3.33s)
shared_test.go:41: executing ./bin/trivial (trivial executable) failed signal: segmentation fault (core dumped):
--- FAIL: TestDivisionExecutable (0.59s)
shared_test.go:41: executing ./bin/division (division executable) failed signal: segmentation fault (core dumped):
--- FAIL: TestCgoExecutable (1.15s)
shared_test.go:41: executing ./bin/execgo (cgo executable) failed signal: segmentation fault (core dumped):
--- FAIL: TestGopathShlib (3.47s)
shared_test.go:41: executing ./bin/exe (executable linked to GOPATH library) failed signal: segmentation fault (core dumped):
--- FAIL: TestTwoGopathShlibs (3.42s)
shared_test.go:41: executing ./bin/exe2 (executable linked to GOPATH library) failed signal: segmentation fault (core dumped):
--- FAIL: TestThreeGopathShlibs (5.23s)
shared_test.go:41: executing ./bin/exe3 (executable linked to GOPATH library) failed signal: segmentation fault (core dumped):
--- FAIL: TestABIChecking (3.08s)
shared_test.go:861: exe failed, but without line "abi mismatch detected between the executable and libdepBase.so"; got output:
--- FAIL: TestImplicitInclusion (1.35s)
shared_test.go:41: executing ./bin/implicitcmd (running executable linked against library that contains same package as it) failed signal: segmentation fault (core dumped):
--- FAIL: TestInterface (1.93s)
shared_test.go:41: executing ./bin/iface (running type/itab uniqueness tester) failed signal: segmentation fault (core dumped):
--- FAIL: TestGlobal (1.33s)
shared_test.go:41: executing ./bin/global (global executable) failed signal: segmentation fault (core dumped):
2018/10/23 10:44:37 executing go test -installsuffix=8674665223082153551 -linkshared -test.short sync/atomic failed exit status 1:
signal: segmentation fault (core dumped)
FAIL sync/atomic 0.088s
exit status 1
FAIL _/home/aion1223/go/misc/cgo/testshared 43.680s
misc/cgo/testshared failed with a segmentation fault. It looks exactly like the following issue:
#24873
However, unlike that issue, mine is reproducible with every version of Go I have tried. It is reproducible only on Oracle Linux 7.5 for ARM64, which is available as a docker image. I could not reproduce it with the Debian-based golang:1.11.1 docker container. I did confirm that the fix for 24873 is already in the go1.11.1 source code.