#65738 causes SIGSEGV running helloworld on linux-x86 preview 3 #68391
Comments
I can also confirm the same issue seems to persist in Stack trace is:
|
@ta264 thank you for reporting this issue, you are definitely right and the issue is caused by my changes. I'll fix it. |
Thanks! |
@ta264 it is strange, but it works OK for me (tested with a build from the main branch). I set a breakpoint at FixupPrecode::GenerateCodePage; it was hit multiple times and generated the expected code. Which clang version did you use to build it? |
I was using clang 9, which is the one we use for official builds. |
@janvorli thanks very much for investigating. I was using clang 9. This is the docker file I was using to generate the build environment and crossrootfs: Then I make the patches listed here: Then build runtime in two stages: Which mirrors how we had to do it on FreeBSD to get everything working. Finally, I'm testing with |
Can I ask how you're building it? It's very possible my edits to make it build are having unintended consequences |
I've used the sed commands and the build options from the Runtime part of your https://github.com/Servarr/dotnet-linux-x86/blob/4b0474c30e5ea2900e14fbe94831d64d7f44b318/azure-pipelines.yml. I executed them manually, without Docker and without passing in the official build id, and I used the build.sh from the root of the repo, not the one from the eng folder.
ROOTFS_DIR=~/rootfs/x86 ./eng/common/cross/build-rootfs.sh x86 bionic
I was running it in a docker container created from Would you be able to share a disassembly of the code of the crashing method? Just running the GDB |
Dump of assembler code for function FixupPrecode::GenerateCodePage(unsigned char*, unsigned char*):
0xf747b810 <+0>: push %ebp
0xf747b811 <+1>: mov %esp,%ebp
0xf747b813 <+3>: push %ebx
0xf747b814 <+4>: push %edi
0xf747b815 <+5>: push %esi
0xf747b816 <+6>: sub $0xc,%esp
0xf747b819 <+9>: call 0xf747b81e <FixupPrecode::GenerateCodePage(unsigned char*, unsigned char*)+14>
0xf747b81e <+14>: pop %ebx
0xf747b81f <+15>: add $0x55c40a,%ebx
0xf747b825 <+21>: call 0xf76d0210 <GetOsPageSize()>
0xf747b82a <+26>: mov %eax,%ecx
0xf747b82c <+28>: mov $0x2aaaaaab,%edx
0xf747b831 <+33>: imul %edx
0xf747b833 <+35>: mov %edx,%eax
0xf747b835 <+37>: shr $0x1f,%eax
0xf747b838 <+40>: shr $0x2,%edx
0xf747b83b <+43>: add %eax,%edx
0xf747b83d <+45>: shl $0x3,%edx
0xf747b840 <+48>: lea (%edx,%edx,2),%edi
0xf747b843 <+51>: test %edi,%edi
0xf747b845 <+53>: jle 0xf747b8ca <FixupPrecode::GenerateCodePage(unsigned char*, unsigned char*)+186>
0xf747b84b <+59>: mov 0x8(%ebp),%edx
0xf747b84e <+62>: add 0xc(%ebp),%ecx
0xf747b851 <+65>: mov 0x570(%ebx),%eax
0xf747b857 <+71>: add %edx,%eax
0xf747b859 <+73>: mov %eax,-0x18(%ebp)
0xf747b85c <+76>: mov 0x116c(%ebx),%eax
0xf747b862 <+82>: add %edx,%eax
0xf747b864 <+84>: mov %eax,-0x14(%ebp)
0xf747b867 <+87>: mov 0x11f8(%ebx),%eax
0xf747b86d <+93>: add %edx,%eax
0xf747b86f <+95>: mov %eax,-0x10(%ebp)
0xf747b872 <+98>: mov 0x1038(%ebx),%ebx
0xf747b878 <+104>: xor %esi,%esi
0xf747b87a <+106>: nop
0xf747b87b <+107>: nop
0xf747b87c <+108>: nop
0xf747b87d <+109>: nop
0xf747b87e <+110>: nop
0xf747b87f <+111>: nop
0xf747b880 <+112>: movsd 0x10(%ebx),%xmm0
0xf747b885 <+117>: mov 0x8(%ebp),%eax
0xf747b888 <+120>: movsd %xmm0,0x10(%eax,%esi,1)
0xf747b88e <+126>: movsd (%ebx),%xmm0
0xf747b892 <+130>: movsd 0x8(%ebx),%xmm1
0xf747b897 <+135>: movsd %xmm1,0x8(%eax,%esi,1)
0xf747b89d <+141>: movsd %xmm0,(%eax,%esi,1)
0xf747b8a2 <+146>: mov %edi,%eax
0xf747b8a4 <+148>: lea (%ecx,%esi,1),%edi
0xf747b8a7 <+151>: mov -0x10(%ebp),%edx
=> 0xf747b8aa <+154>: mov %edi,(%edx,%esi,1)
0xf747b8ad <+157>: lea 0x4(%ecx,%esi,1),%edi
0xf747b8b1 <+161>: mov -0x14(%ebp),%edx
0xf747b8b4 <+164>: mov %edi,(%edx,%esi,1)
0xf747b8b7 <+167>: lea 0x8(%ecx,%esi,1),%edi
0xf747b8bb <+171>: mov -0x18(%ebp),%edx
0xf747b8be <+174>: mov %edi,(%edx,%esi,1)
0xf747b8c1 <+177>: mov %eax,%edi
0xf747b8c3 <+179>: add $0x18,%esi
0xf747b8c6 <+182>: cmp %eax,%esi
0xf747b8c8 <+184>: jl 0xf747b880 <FixupPrecode::GenerateCodePage(unsigned char*, unsigned char*)+112>
0xf747b8ca <+186>: add $0xc,%esp
0xf747b8cd <+189>: pop %esi
0xf747b8ce <+190>: pop %edi
0xf747b8cf <+191>: pop %ebx
0xf747b8d0 <+192>: pop %ebp
0xf747b8d1 <+193>: ret
Let me know if I can provide anything else, happy to help. I tried running under |
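As a reading aid for the disassembly above, the integer arithmetic at offsets +26 to +48 appears to round the OS page size down to a multiple of 24 (0x18) bytes, which is the step the loop adds to %esi on each pass. The Python sketch below mirrors those instructions for a non-negative input. The function name is hypothetical, and the 4096-byte page size is an assumption, not a value taken from the thread.

```python
def loop_bound(page_size: int) -> int:
    """Mirror the arithmetic at +26..+48 for a non-negative 32-bit page size."""
    magic = 0x2AAAAAAB                  # constant loaded at +28; equals ceil(2**32 / 6)
    high = (page_size * magic) >> 32    # imul leaves the high 32 bits in edx
    sign = (high >> 31) & 1             # shr $0x1f: sign correction, 0 for these inputs
    quotient = (high >> 2) + sign       # shr $0x2 then add: page_size // 24
    return (quotient << 3) * 3          # shl $0x3 then lea (edx,edx,2): quotient * 24


# Assumed page size of 4096 bytes (not stated in the thread).
bound = loop_bound(4096)
print(bound, bound // 0x18)  # prints: 4080 170
```

Under that assumption the loop would run 170 times, advancing 24 bytes per pass, and the faulting store at +154 happens on one of those passes.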
I'm struggling to get it to build under an 18.04 rootfs: debootstrap fails in docker, and natively I end up with
Any tips appreciated! |
Here are the exact steps I used to build the rootfs and then build the runtime from the main branch with it. I did it from scratch to make sure nothing stale was left in my repo by accident. I am running the build on a native x64 Ubuntu 18.04 box.
First check out the dotnet/runtime repo and apply the "sed" changes that you've shared:
In the dotnet/arcade repo:
sudo ROOTFS_DIR=/home/janvorli/rootfs/x86.latest eng/common/cross/build-rootfs.sh x86 bionic
In the dotnet/runtime repo:
ROOTFS_DIR=/home/janvorli/rootfs/x86.latest ./build.sh -c Release -cross -os Linux -arch x86 -clang9 -subset Clr.Native+Host.Native
ROOTFS_DIR=/home/janvorli/rootfs/x86.latest ./build.sh -c Release -cross -os Linux -arch x86 -clang9 /p:AppHostSourcePath=/home/janvorli/git/runtime/artifacts/obj/linux-x86.Release/apphost/standalone/apphost
This completes cleanly for me. Since all the platform-specific libraries should come from the rootfs, I would expect it to work the same way for you. |
Thanks a lot for the details. I'll investigate further and let you know. |
I managed to get it to compile with the bionic rootfs - I had to do the build inside an 18.04 docker container. Not sure why it didn't like the underlying 20.04 VM. I'll see what happens if I try a 20.04 container. The output built with the bionic rootfs in the 18.04 container fails with the same segfault as before. |
Ok, how do you apply the stuff built by the steps to the actual testing application? I would like to test it locally the same way as you do. |
I created a helloworld app on the host using
Then from a directory adjacent to my
Finally, with my
Let me know if I can give any more details / do anything else to help. Would it help to create a version of the pipeline that produces a runtime with the segfault so you can see the precise steps and download the result? |
I've followed your steps with slight changes, as my build didn't use the official build id, so the package names were a bit different. Here are my commands:
tar -xf ../runtime/artifacts/packages/Release/Shipping/dotnet-runtime-7.0.0-dev-linux-x86.tar.gz ./
tar -xf ../runtime/artifacts/packages/Release/Shipping/dotnet-runtime-symbols-linux-x86-7.0.0-dev.tar.gz -C shared/Microsoft.NETCore.App/7.0.0-dev/
I had to update the version in I have used docker container built from the image And the test printed "Hello World". So I wonder what can be causing the difference. I've tried even without the |
I have tweaked my pipeline to just produce runtime and build I can replicate the failure by grabbing and extracting the relevant artifacts:
Then testing in a container like this (which is not privileged, but gdb still seems to work? I've not really used gdb before)
It does seem very odd. Happy to try your One question: when I build without official build id I get something named |
I guess you get the
{
  "runtimeOptions": {
    "tfm": "net7.0",
    "framework": {
      "name": "Microsoft.NETCore.App",
      "version": "7.0.0-dev"
    }
  }
}
I think there is a way to pass the version to build or set it in the project, but I am not an expert on these things, so I've resorted to plain editing of this file. I'll give your files a try and let you know. If that works for me, it would mean that there is something specific to your testing host machine, e.g. something in the kernel (docker shares the kernel with the host). I could share some debugging instructions to figure out what's wrong. Just as a sanity check, does the |
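The manual edit described in the comment above could also be scripted. This is a minimal sketch, assuming the file has the same shape as the JSON quoted in the thread. The function name and the example file name are hypothetical, and the script simply rewrites the file in place.

```python
import json
from pathlib import Path


def set_framework_version(config_path, version):
    """Rewrite runtimeOptions.framework.version in a runtimeconfig-style JSON file."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumes the "runtimeOptions" -> "framework" -> "version" layout shown in the thread.
    config["runtimeOptions"]["framework"]["version"] = version
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (hypothetical file name):
# set_framework_version("helloworld.runtimeconfig.json", "7.0.0-dev")
```

Editing the file after the build only changes which framework version the host looks for. It does not rebuild anything.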
From the host:
From the testing container:
|
And to confirm, dropping the Happy to go through any debugging instructions, just let me know. Thanks again for all your time. |
In case it helps, I just tried using the output from the Azure pipeline in a fresh Ubuntu 18.04 VM and I hit the same SIGSEGV. |
So your build crashes on my machine as well while my build works. I wonder, have you built it with clang9 (passing |
When the native part of the runtime build begins, it prints the compiler version to the console, so you could also use that to see which clang was used. |
That's interesting, at least it's reproducible! I assume it's clang9 since that's all I installed in the dockerfile. This is a snippet from the build log:
So it looks like clang9. I've also tried locally with the clang9 option and it still fails. |
I have stepped through the crashing code both with my build and your build. Your build somehow gets incorrect values of FixupPrecodeCode_Target_Offset, FixupPrecodeCode_MethodDesc_Offset and FixupPrecodeCode_PrecodeFixupThunk_Offset. Instead of 2, 7, and 0xd, it gets 0xed5d9002, 0xed5d9007 and 0xed5d900d. That makes this write go to a completely wrong location:
Interestingly, the debugger prints the offset values correctly, so it seems like some linker issue. This is the code that gets those offsets and stores them in [ebp-0x10], [ebp-0x14] and [ebp-0x18] for use in the loop:
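A quick arithmetic check on the values quoted above: subtracting each expected offset from the value the failing build observed gives the same number in all three cases. The sketch below uses only the figures from the comment. The helper name is hypothetical, and what the common difference means is not confirmed anywhere in the thread.

```python
# Values quoted in the comment above.
expected = {
    "FixupPrecodeCode_Target_Offset": 0x2,
    "FixupPrecodeCode_MethodDesc_Offset": 0x7,
    "FixupPrecodeCode_PrecodeFixupThunk_Offset": 0xD,
}
observed = {
    "FixupPrecodeCode_Target_Offset": 0xED5D9002,
    "FixupPrecodeCode_MethodDesc_Offset": 0xED5D9007,
    "FixupPrecodeCode_PrecodeFixupThunk_Offset": 0xED5D900D,
}


def offset_bias(expected, observed):
    """Return observed minus expected for every named offset."""
    return {name: observed[name] - expected[name] for name in expected}


biases = offset_bias(expected, observed)
for name, bias in biases.items():
    print(f"{name}: {bias:#x}")  # each line ends in 0xed5d9000
```

The shared difference, 0xed5d9000, is a multiple of 4096. That would be consistent with an address having been added to each small offset, though this is only one possible reading of the numbers.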
That sounds like a possible linker bug. Do you have lld-9 installed? I do, so my build was using it. We fall back to using GNU ld if lld is not found. |
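To see which linkers a build machine or container actually has on its PATH, a small check like the following may help. The binary names are assumptions based on how Debian and Ubuntu typically package lld-9 and binutils. They are not taken from the dotnet/runtime build scripts, and this sketch does not show which linker the build ends up using.

```python
import shutil


def find_linkers(names=("ld.lld-9", "lld-9", "ld.lld", "ld")):
    """Map each candidate linker name to its path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for name, path in find_linkers().items():
    print(f"{name}: {path or 'not found'}")
```

Running it inside the build container, rather than on the host, matters here because the two can have different packages installed.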
I do not have that installed inside the build docker or natively. I'll try a build with that installed. |
Making sure |
Great, thank you for the confirmation! |
Rebuilt the full Preview3 SDK with lld-9 and it works, it can generate build and run the So from my point of view I think this is resolved, thanks again. Might be worth disabling the |
I was happy to help and I am glad we've found the culprit. |
Thanks again |
Description
I can build a functional SDK for linux-x86 for .NET 7 Preview 2 once I fix #68044 and disable ReadyToRun (which causes other issues I will dig into next).
This can run a hello world application OK.
After updating to preview 3, it fails with SIGSEGV. I bisected and the failure is introduced by eb8460f from #65738
If I revert this commit and rebuild preview 3 then everything works again - see the pipeline here:
https://github.com/Servarr/dotnet-linux-x86/tree/4b0474c30e5ea2900e14fbe94831d64d7f44b318
Reproduction Steps
On a linux-x64 host with a preview 3 SDK, run
dotnet new console && dotnet build
Build .NET 7 preview 3 runtime for linux-x86. Example pipeline is here:
https://github.com/Servarr/dotnet-linux-x86/tree/4b0474c30e5ea2900e14fbe94831d64d7f44b318
Remove this line reverting the commit causing the issue:
https://github.com/Servarr/dotnet-linux-x86/blob/4b0474c30e5ea2900e14fbe94831d64d7f44b318/azure-pipelines.yml#L77
Run the output on the linux-x86 host (I'm using an Ubuntu 20.04 docker with multilib support enabled) using the runtime generated above.
Expected behavior
Hello, World!
Actual behavior
Regression?
This is a regression from .NET 7 Preview 2
Known Workarounds
Rebuild Preview 3 with eb8460fd29f reverted
Configuration
.NET 7 Preview 3
Cross compiling on Ubuntu 20.04 for linux-x86
Running the output in an Ubuntu 20.04 docker with multilib support enabled
Other information
I think the issue is introduced by eb8460f from #65738 authored by @janvorli (I hope this is not inappropriate, apologies in advance if I shouldn't have mentioned you)