runtime: go 1.8 + ARM + 64k pages not working #18408
Comments
Could @DTrace001 please run the hello world program under gdb
to see exactly what is triggering the bad jump to address 0x88868?
Run the program under gdb, and run the following three gdb commands
when it stops due to SIGSEGV:
1. run "bt" to see where it is when the SIGSEGV happens.
2. run "x/10i $pc-12" to show the instructions that trigger the segfault.
3. run "info reg" to show the register contents.
And paste the gdb output here. Thanks.
|
@DTrace001, while you're in gdb stopped at the SIGSEGV, could you also run Can you also confirm that the "hello" binary you're running is the one from http://pub.rclone.org/rclone-v1.34-75-gcbfec0d-arm-go-tip.zip (sha1sum da609dc3998c7f080e380c469aca6d3c623472c1)? |
I will first need to compile gdb for the wdmycloud device, which uses a 64K page size, since gdb does not ship with it. From there I can continue to help debug the issue. They did not make it easy to run custom programs on this device. Sorry, I probably won't be able to do this until early next week. |
How about using gdbserver on the device? It should be much easier
to compile than gdb.
|
@minux Sorry, but I am having no luck compiling gdb on a Debian wheezy host for armhf with a 64K page size. The wdmycloud device runs a hacked version of Debian wheezy for armhf with a 64K page size. |
@DTrace001, in that case, can we get you to make a core dump? Set |
@minux, just looking at the binary, that address is suspiciously near the boundary between these two segments:
The faulting address (0x88868) falls right near the end of the first of these two segments (0x888cc). The start of the second segment rounds down to 0x80000. I'm guessing the kernel mapped this wrong (even though in principle it could have done it right) and the page wound up marked no-execute. I think the core file should have enough information to confirm this. |
Yeah, I think that's the culprit. We need to increase executable
segment alignment.
|
Excellent debugging!
…On Thu, Dec 29, 2016 at 11:52 AM, Minux Ma ***@***.***> wrote:
Yeah, I think that's the culprit. We need to increase executable
segment alignment.
|
How hard would that be to do? Could we do it before the rc on Sunday? I think we've already figured out the maximum page size for each GOARCH in runtime/internal/sys.DefaultPhysPageSize. We can base the alignments on those. It looks like the default page size on Linux for all of the arches is 4K, which is probably why we're not seeing a more widespread problem with this. /cc @randall77 @rsc |
I just read through Linux's binfmt_elf and confirmed that it doesn't try to do anything smart with overlapping segments. It just maps them in the order and with the permissions given, rounded out to page boundaries. So, if we want to be clever, we don't need to align all segments to the maximum page boundary. We only need to do this where we go from more to less permissive segments, such as the read+exec segment to the read-only segment. But it's probably better to just align all of the segments unless that causes too much binary bloat, which I doubt that it will. |
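The "clever" rule in the comment above can be sketched in Go. This is only an illustration of the idea, not the linker's code: the `perm` type, `needsAlign`, `roundUp`, and the rodata and data sizes are all hypothetical, and the sketch assumes a 4K default alignment with a 64K maximum page size.

```go
package main

import "fmt"

// perm is a hypothetical bit set of segment permissions.
type perm uint8

const (
	permR perm = 1 << iota
	permW
	permX
)

// needsAlign reports whether next lacks a permission that prev has,
// i.e. the layout goes from a more permissive to a less permissive segment.
func needsAlign(prev, next perm) bool { return prev&^next != 0 }

// roundUp rounds addr up to a multiple of align (a power of two).
func roundUp(addr, align uint32) uint32 { return (addr + align - 1) &^ (align - 1) }

func main() {
	const minAlign, maxPage = 4 << 10, 64 << 10
	segs := []struct {
		name string
		p    perm
		size uint32
	}{
		{"text", permR | permX, 0x788cc},
		{"rodata", permR, 0x60000},      // size is an assumption
		{"data", permR | permW, 0x8000}, // size is an assumption
	}
	addr := uint32(0x10000)
	for i, s := range segs {
		align := uint32(minAlign)
		if i > 0 && needsAlign(segs[i-1].p, s.p) {
			align = maxPage // only pay for the large alignment where permissions drop
		}
		addr = roundUp(addr, align)
		fmt.Printf("%-6s starts at %#x (align %#x)\n", s.name, addr, align)
		addr += s.size
	}
}
```

Aligning every segment to the maximum page size, as the comment goes on to suggest, corresponds to always using `maxPage` in the loop.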
Hope this helps.
|
@DTrace001, thanks, but it looks like you forgot to attach the actual "core" file that was produced. |
@aclements Sorry, I've updated the comment with core.zip attached. |
Thanks for the core file. This further confirms that the segments are getting rounded such that we're losing the executable bit on part of the text section. We can compare the LOAD segments from the original binary with the LOAD segments in the core file:
Original binary:
Core file:
The executable segment got truncated to a length of 0x70000 bytes from 0x788cc and the start of the read-only segment got rounded down to 0x80000, which means the faulting PC 0x88868 lands in the read-only segment instead of the executable segment. |
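The arithmetic above can be reproduced with a small Go sketch. It models the loader the way the thread describes it: each segment is mapped in order, rounded out to page boundaries, and a later mapping replaces an earlier one where they overlap. The text bounds (0x10000 to 0x888cc) follow from the numbers in the thread; the read-only segment's unrounded bounds (0x89000 to 0xf0000) are assumptions for illustration, and `segment` and `permAt` are hypothetical names.

```go
package main

import "fmt"

// segment is a simplified, hypothetical stand-in for an ELF LOAD segment.
type segment struct {
	name       string
	start, end uint32 // virtual address range [start, end)
	exec       bool
}

// permAt maps each segment in order, rounding its bounds out to pageSize,
// and reports which mapping covers addr afterwards. Later mappings
// replace earlier ones where they overlap.
func permAt(segs []segment, pageSize, addr uint32) (name string, exec, ok bool) {
	for _, s := range segs {
		lo := s.start &^ (pageSize - 1)
		hi := (s.end + pageSize - 1) &^ (pageSize - 1)
		if addr >= lo && addr < hi {
			name, exec, ok = s.name, s.exec, true
		}
	}
	return
}

func main() {
	segs := []segment{
		{"text", 0x10000, 0x888cc, true},
		{"rodata", 0x89000, 0xf0000, false}, // bounds are an assumption
	}
	const pc = 0x88868 // faulting PC from the thread
	for _, pageSize := range []uint32{4 << 10, 64 << 10} {
		name, exec, _ := permAt(segs, pageSize, pc)
		fmt.Printf("page size %#x: pc %#x is in %s (exec=%v)\n", pageSize, pc, name, exec)
	}
}
```

With 4K pages the PC stays inside the executable mapping; with 64K pages the read-only segment's start rounds down to 0x80000 and the PC lands in a non-executable mapping.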
@DTrace001 or @ncw, can you build the test binary with the |
As a quick experiment, I rebuilt all of the usual Go binaries with 64K segment alignment (on amd64). On average, they got 3% larger:
|
@DTrace001 I've built a binary with this for you to try
Here it is (zipped): hello.zip |
We already set the default FlagRound to 64k on a variety of GOOS/GOARCH combinations. It's easy to add linux/arm. I sent CL 34629. |
@ncw @aclements Jackpot !!!
|
CL https://golang.org/cl/34629 mentions this issue. |
We've now confirmed that this fixes the original issue in rclone/rclone#426. Very pleased we can get this into 1.8. Thank you all for your help. |
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
What operating system and processor architecture are you using (go env)?
Compile host
Target machine
What did you do?
Compile this program and run it: https://play.golang.org/p/sqjFommznr
What did you expect to see?
Hello, World!
What did you see instead?
This issue was discovered as part of rclone/rclone#426. @DTrace001 is the one with the hardware, I'm just reporting the problem.