
[Arm64] Implement stack probing using helper #13519

Open
BruceForstall opened this issue Oct 2, 2019 · 6 comments
Labels: arch-arm64, area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
Milestone: Future

Comments

@BruceForstall (Member)

Stack probing using a helper was implemented for x86/x64 in dotnet/coreclr#26807. This issue tracks the work to implement it for arm32/arm64.

This would provide consistency across architectures, simplify the JIT's stack probing implementation, and bring the benefit of stack overflow exception stack traces to arm32/arm64.

Related: https://github.com/dotnet/coreclr/issues/21061

Update: the Arm32 part of this issue is addressed in dotnet/coreclr#27184.
For Arm64 we decided only to fix the stack probing loop (it currently under-probes by one page) without implementing the helper.

In the future, we can implement the helper by following the approach suggested in dotnet/coreclr#27184 (comment), but it is out of scope for the near term.

category:implementation
theme:prolog-epilog
skill-level:intermediate
cost:medium
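
For readers new to this area, below is a minimal conceptual sketch of the two probing schemes discussed in this issue. It is not JIT output or runtime code: the function names, the helper signature, and the 4 KB page size are assumptions for illustration only.

// Conceptual sketch only -- not the real prolog code emitted by the JIT.
#include <cstddef>

constexpr std::size_t kPageSize = 0x1000; // assumed 4 KB pages

// Scheme 1: inline probing loop (what arm32/arm64 currently emit).
// Each page of the new frame is touched before SP is dropped in one step,
// so SP itself does not move while the probing runs.
void ProbeWithInlineLoop(volatile char* sp, std::size_t frameSize)
{
    for (std::size_t offset = kPageSize; offset <= frameSize; offset += kPageSize)
    {
        (void)sp[-static_cast<std::ptrdiff_t>(offset)]; // touch one byte per page
    }
    // The real prolog then performs a single SP adjustment of frameSize.
    // Note: when frameSize is not page-aligned, the loop may stop one page short
    // of the new SP -- the under-probing analyzed in the comments below.
}

// Scheme 2: helper-based probing (what x86/x64 use after dotnet/coreclr#26807).
// The prolog calls a runtime helper instead of emitting the loop; the helper
// moves SP while probing, which also makes a faulting probe easy to classify
// as a stack overflow (see the later comments).
// void StackProbeHelper(std::size_t frameSize); // hypothetical signature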

@echesakov (Contributor) commented Oct 18, 2019

I was testing my implementation of stack probing using helpers on linux-arm and comparing its behavior with the current implementation of stack probing using inlined loops. I believe the current implementation under-probes by one page.

For example, below I have a disassembly of a funclet with large outgoing argument space (32712 bytes).

(gdb) disassemble 0xaa0140d0,+50
Dump of assembler code from 0xaa0140d0 to 0xaa014102:
   0xaa0140d0:  stmdb   sp!, {r4, r10, r11, lr}
   0xaa0140d4:  movw    r3, #61440      ; 0xf000
   0xaa0140d8:  sxth    r3, r3
   0xaa0140da:  movw    r2, #32824      ; 0x8038
   0xaa0140de:  sxth    r2, r2
   0xaa0140e0:  ldr.w   r1, [sp, r3]
   0xaa0140e4:  sub.w   r3, r3, #4096   ; 0x1000
   0xaa0140e8:  cmp     r2, r3
   0xaa0140ea:  bls.n   0xaa0140e0
=> 0xaa0140ec:  add     sp, r2
   0xaa0140ee:  add.w   r3, r11, #8
   0xaa0140f2:  movw    r10, #32708     ; 0x7fc4
   0xaa0140f6:  str.w   r3, [sp, r10]
   0xaa0140fa:  movs    r2, #0
   0xaa0140fc:  movs    r3, #0
   0xaa0140fe:  vmov    d4, r2, r3
End of assembler dump.
(gdb) info reg r2 r3 sp
r2             0xffff8038       4294934584
r3             0xffff8000       4294934528
sp             0xbea07398       0xbea07398

The thread stack ends at 0xBEA00000.

1121:   /mnt/ssd/git/coreclr/BinDir_Linux_arm_debug/corerun GitHub_21061_StackOverflowInFuncletProlog.exe
Address   Kbytes Mode  Offset           Device    Mapping
00400000      16 r-x-- 0000000000000000 008:00001 corerun
00413000       4 r---- 0000000000003000 008:00001 corerun
00414000       4 rw--- 0000000000004000 008:00001 corerun
.
.
b6fff000       4 rw--- 0000000000019000 0b3:00002 ld-2.27.so
bea00000    6144 rw--- 0000000000000000 000:00000   [ stack ]
ffff0000       4 r-x-- 0000000000000000 000:00000   [ anon ]
mapped: 222664K    writeable/private: 50228K    shared: 74224K

Below is a summary of what happens in the loop (a small standalone sketch that replays this arithmetic follows the list):

Initial SP at the beginning of the funclet prolog: 0xBEA07398
Funclet frame size: 0x7FC8 (32712 bytes)
Last probed address: 0xBEA00398 = 0xBEA07398 - 0x7000
First address on the last probed page: 0xBEA00000
First address on the first unprobed page: 0xBE9FF000. Note that this address does not belong to the stack.
First address accessed after the funclet prolog: 0xBE9FF3D0
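
To make the under-probing concrete, here is a small standalone C++ sketch (not runtime or JIT code) that replays the loop arithmetic above. The 4 KB page size and the SP/frame-size values are taken from this trace; everything else is for illustration only.

// Standalone sketch: replays the probe-loop arithmetic from the disassembly
// above, assuming a 4 KB page size and the values of this specific funclet.
#include <cstdint>
#include <cstdio>

int main()
{
    const uint32_t pageSize  = 0x1000;
    const uint32_t initialSp = 0xBEA07398u;
    const uint32_t frameSize = 0x7FC8;          // r2 holds -frameSize (0xFFFF8038)

    int32_t  probeOffset = -static_cast<int32_t>(pageSize); // r3 starts at -0x1000
    uint32_t lastProbed  = 0;

    // Models: ldr r1,[sp,r3] / sub r3,r3,#0x1000 / cmp r2,r3 / bls loop
    do
    {
        lastProbed   = initialSp + probeOffset;       // address the ldr touches
        probeOffset -= static_cast<int32_t>(pageSize);
    } while (0u - frameSize <= static_cast<uint32_t>(probeOffset)); // unsigned "bls"

    const uint32_t finalSp = initialSp - frameSize;

    std::printf("last probed: 0x%08X (page 0x%08X)\n",
                static_cast<unsigned>(lastProbed),
                static_cast<unsigned>(lastProbed & ~(pageSize - 1)));
    std::printf("final SP:    0x%08X (page 0x%08X)\n",
                static_cast<unsigned>(finalSp),
                static_cast<unsigned>(finalSp & ~(pageSize - 1)));
    // Prints: last probed page 0xBEA00000, final SP page 0xBE9FF000 --
    // the page the funclet body stores to was never probed.
    return 0;
}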

The funclet then segfaults in the body of the funclet:

Thread 1 "corerun" received signal SIGSEGV, Segmentation fault.
0xaa014102 in ?? ()
(gdb) bt
#0  0xaa014102 in ?? ()
#1  0xb66653a8 in CallEHFunclet () at /__w/3/s/src/vm/arm/ehhelpers.S:100
#2  0xb6649f52 in ExceptionTracker::CallHandler (this=0x44e760, uHandlerStartPC=2852208848, sf=..., pEHClause=0x44e7b4, pMD=0xb25fb91c, funcletType=Catch, pContextRecord=0xbea07da0)
    at /__w/3/s/src/vm/exceptionhandling.cpp:3405
#3  0xb6649ce2 in ExceptionTracker::CallCatchHandler (this=0x44e760, pContextRecord=0xbea07da0, pfAborting=0xbea077bf) at /__w/3/s/src/vm/exceptionhandling.cpp:656
#4  0xb664b0f2 in ProcessCLRException (pExceptionRecord=0x482218, MemoryStackFp=3204440864, pContextRecord=0xbea07da0, pDispatcherContext=0xbea079fc) at /__w/3/s/src/vm/exceptionhandling.cpp:1192
#5  0xb6650af0 in UnwindManagedExceptionPass2 (ex=..., unwindStartContext=0xbea07da0) at /__w/3/s/src/vm/exceptionhandling.cpp:4489
#6  0xb6650f10 in UnwindManagedExceptionPass1 (ex=..., frameContext=0xbea07fe8) at /__w/3/s/src/vm/exceptionhandling.cpp:4651
#7  0xb6651466 in DispatchManagedException (ex=..., isHardwareException=false) at /__w/3/s/src/vm/exceptionhandling.cpp:4777
#8  0xb6566a50 in __FCThrow (__me=0x0, reKind=kDivideByZeroException, resID=0, arg1=0x0, arg2=0x0, arg3=0x0) at /__w/3/s/src/vm/fcall.cpp:56
#9  0xb6577f6c in JIT_Div (dividend=1, divisor=0) at /__w/3/s/src/vm/jithelpers.cpp:277
#10 0xaa0140ae in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) disassemble 0xaa0140d0,0xaa014106
Dump of assembler code from 0xaa0140d0 to 0xaa014106:
   0xaa0140d0:  stmdb   sp!, {r4, r10, r11, lr}
   0xaa0140d4:  movw    r3, #61440      ; 0xf000
   0xaa0140d8:  sxth    r3, r3
   0xaa0140da:  movw    r2, #32824      ; 0x8038
   0xaa0140de:  sxth    r2, r2
   0xaa0140e0:  ldr.w   r1, [sp, r3]
   0xaa0140e4:  sub.w   r3, r3, #4096   ; 0x1000
   0xaa0140e8:  cmp     r2, r3
   0xaa0140ea:  bls.n   0xaa0140e0
   0xaa0140ec:  add     sp, r2
   0xaa0140ee:  add.w   r3, r11, #8
   0xaa0140f2:  movw    r10, #32708     ; 0x7fc4
   0xaa0140f6:  str.w   r3, [sp, r10]
   0xaa0140fa:  movs    r2, #0
   0xaa0140fc:  movs    r3, #0
   0xaa0140fe:  vmov    d4, r2, r3
=> 0xaa014102:  vstr    d4, [sp]
(gdb) info reg sp
sp             0xbe9ff3d0       0xbe9ff3d0

The analysis was done on top of commit ef3180c.

@echesakov echesakov changed the title Implement stack probing using helper for arm32/arm64 [Arm64] Implement stack probing using helper Oct 31, 2019
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@janvorli (Member)

While implementing stack trace printing on stack overflow, I found that not moving SP during probing on Linux on ARM64 (since we haven't implemented probing using a helper for arm64) causes a problem. To print the stack overflow stack trace, we need about 28 kB of stack space, so when a stack overflow is detected in the SIGSEGV handler, we switch to a special preallocated stack of that size and run the exception handling on it. But when we hit the SIGSEGV, we cannot get the actual stack limits, because calling the function that retrieves them is not allowed from an async signal handler when you don't know what code triggered it. So we decide whether it is a stack overflow based on whether the accessed address lies within +/- one page of SP.
That means that without moving SP during stack probing, we don't consider the fault a stack overflow at this point (where we run on a per-thread alternate stack for handling SIGSEGVs). So instead of switching to the special stack overflow stack, we switch back to the thread's original stack.

In both cases we then run common_signal_handler, which is shared by all hardware exceptions. At this point, if we are executing on the original stack, it is possible that we have only a little over one memory page of stack space left. That is enough to check whether we are running in managed code and, if so, to read the actual stack limits and detect the stack overflow even for probing without the helper.
But the remaining stack space is not sufficient for printing the stack trace. Previously we just printed the "Stack overflow" message and aborted the process; now we actually call the runtime's hardware exception handler, which checks for stack overflow and ends up calling EEPolicy::HandleFatalStackOverflow, which dumps the stack trace.

If the probing helper that moves SP while probing were implemented for arm64, this problem would go away, as we would never hit this code path.
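
For illustration, here is a minimal sketch of the "within +/- one page of SP" check described above. The names and the page-size constant are assumptions for the example, not the actual PAL signal-handler code.

// Sketch only: models the heuristic used when the real stack limits cannot be
// queried from inside the SIGSEGV handler. Names and constants are assumed.
#include <cstdint>

constexpr std::uintptr_t kPageSize = 0x1000; // assumed 4 KB pages

// Decide whether a faulting access looks like a stack overflow, given only
// the faulting address and the interrupted SP.
bool LooksLikeStackOverflow(std::uintptr_t faultAddress, std::uintptr_t sp)
{
    const std::uintptr_t low  = (sp > kPageSize) ? sp - kPageSize : 0;
    const std::uintptr_t high = sp + kPageSize;
    return faultAddress >= low && faultAddress <= high;
}

// With the inline probing loop, the faulting probe can be several pages below
// the not-yet-adjusted SP (0x7000 bytes in the arm32 trace above), so this
// check does not fire and the handler falls back to the thread's original stack
// instead of the dedicated stack overflow stack.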

@BruceForstall BruceForstall modified the milestones: Future, 5.0 Mar 12, 2020
@AndyAyersMS (Member)

@BruceForstall I take it you think this should stay in 5.0?

@BruceForstall (Member, Author)

Yes, @echesakovMSFT plans to get to it in 5.0

@echesakov (Contributor)

This should be moved to 6.0 - I won't have time to work on this

@echesakov echesakov modified the milestones: 5.0.0, 6.0.0 Jun 23, 2020
@BruceForstall BruceForstall added JitUntriaged CLR JIT issues needing additional triage and removed JitUntriaged CLR JIT issues needing additional triage labels Oct 28, 2020
@echesakov echesakov added the in-pr There is an active PR which will close this issue when it is merged label Jan 29, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 26, 2021
@echesakov (Contributor)

I implemented stack probing with helpers on arm64 and posted the changes in https://github.com/echesakovMSFT/runtime/tree/Arm64-Implement-Jit-StackProbe-Helper
However, further work is blocked by #47810, and given that the work on that issue will not be done in 6.0, I am moving this issue to Future.

Projects: Backlog (General)
5 participants