This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Implement stack probing using helpers #27184

Merged
merged 15 commits into from
Oct 29, 2019

echesakov

This partially addresses https://github.com/dotnet/coreclr/issues/26996

  1. Splits CodeGen::genAllocLclFrame into Arm32 and Arm64 specific functions
  2. Implements stack probing on Arm32 via helpers

Implementing stack probing via helpers on Arm64 is more complicated.
The difficulty comes from how we establish the frame pointer in a function prolog and from the fact that stack probing currently happens before lr is saved on the stack.
The latter means that we cannot call any function until that moment: every call would overwrite the previous value of lr.
We also cannot defer the stack probing until after we save the frame record (i.e. the fp, lr pair) on the stack, because some types of stack frames store the frame record at the lowest address on the stack.

I considered several ways of calling the helper without advancing sp:

  1. Jump to the helper instead of calling it (i.e. replace bl with b). This requires computing the return address manually and passing it to the helper. It seems we can't do this in the JIT right now (at least I could not find a way to emit INS_adr and specify PCRelOffset). This approach would probably also confuse the unwinder when a stack overflow (SO) happens.
  2. Store lr in the red zone before the call to the helper and restore the original value of lr inside the helper. This is not going to work: unwind codes on Arm64 don't support negative offsets.
  3. Store the original value of lr in another register. This is not supported by unwind codes.

The only choice I have left is to store lr on the stack, adjust sp, call the helper, and restore lr and sp after the call (or in the helper).
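To make the constraint concrete, here is a toy model of the problem, not JIT code. `ToyCpu`, its methods and the addresses used are all hypothetical; it tracks only sp, lr and a dictionary standing in for stack memory. It sketches why a call made before lr is saved loses the caller's return address, and how saving lr on the stack around the call preserves it.

```python
class ToyCpu:
    """Hypothetical model with only sp, lr and a stack; not real Arm64 semantics."""

    def __init__(self, sp, lr):
        self.sp = sp
        self.lr = lr
        self.stack = {}

    def bl(self, return_address):
        # A branch-with-link overwrites lr with the address after the call.
        self.lr = return_address

    def push_lr(self):
        self.sp -= 16                      # keep sp 16-byte aligned
        self.stack[self.sp] = self.lr

    def pop_lr(self):
        self.lr = self.stack.pop(self.sp)
        self.sp += 16


def probe_without_saving(cpu):
    # Calling the helper before lr is saved clobbers the caller's lr.
    cpu.bl(return_address=0x2000)
    return cpu.lr


def probe_with_saving(cpu):
    # Store lr, adjust sp, call the helper, then restore lr and sp.
    cpu.push_lr()
    cpu.bl(return_address=0x2000)
    cpu.pop_lr()
    return cpu.lr


lost = probe_without_saving(ToyCpu(sp=0x8000, lr=0x1000))
cpu = ToyCpu(sp=0x8000, lr=0x1000)
kept = probe_with_saving(cpu)
print(hex(lost), hex(kept), hex(cpu.sp))
```

In this model the first variant returns the helper's return address instead of the original lr, while the second returns the original lr and leaves sp where it started.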

@dotnet/jit-contrib @janvorli I would value your feedback on the proposal. For now I would like to merge Arm32 stack probing logic only.

[Resolved review thread on src/vm/arm/asmhelpers.S (outdated)]

echesakov commented Oct 22, 2019

This PR is ready for review. I collected the stack traces for the following test cases:

  1. SO happens when a method allocates a local that exceeds the remaining stack size;
  2. SO happens in a funclet prolog when the funclet allocates and probes outgoing argument space that goes beyond the stack boundary.

On win-arm the stack traces look as I would expect:

win-arm - windbg - case 1:

(2964.41ac): Stack overflow - code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** WARNING: Unable to verify checksum for C:\echesako\coreclr\bin\Product\Windows_NT.arm.Checked\CoreCLR.dll
CoreCLR!JIT_StackProbe+0x10:
0f161f80 f85d5d04 ldr         r5,[sp,#-4]!                    05203000=00000000
0:000> k
*** WARNING: Unable to verify checksum for CoreRun.exe
 # Child-SP RetAddr  Call Site
00 05203004 0ad4ab22 CoreCLR!JIT_StackProbe+0x10 [D:\git\coreclr3\bin\obj\Windows_NT.arm.Checked\src\vm\wks\asmhelpers.asm @ 6589] 
01 0537e0a4 0ac3826e GitHub_21061_StackOverflowInFunctionProlog!GitHub_21061._0017B000.AllocLocal()+0x1e
*** WARNING: Unable to verify checksum for C:\echesako\coreclr\GitHub_21061_StackOverflowInFunctionProlog.exe
02 0537e0b8 0f16104c GitHub_21061_StackOverflowInFunctionProlog!GitHub_21061.Program.Main(System.String[])+0x11ca
03 0537e0c0 0f28a870 CoreCLR!CallDescrWorkerInternal+0x45 [D:\git\coreclr3\bin\obj\Windows_NT.arm.Checked\src\vm\wks\asmhelpers.asm @ 4564] 
04 0537e0d0 0f28ad14 CoreCLR!CallDescrWorker+0x99 [D:\git\coreclr3\src\vm\callhelpers.cpp @ 129] 
05 0537e4f8 0f28776e CoreCLR!MethodDescCallSite::CallTargetWorker+0x3a9 [D:\git\coreclr3\src\vm\callhelpers.cpp @ 549] 
06 (Inline) -------- CoreCLR!MethodDescCallSite::Call+0x11 [D:\git\coreclr3\src\vm\callhelpers.h @ 459] 
07 0537e630 0f287520 CoreCLR!RunMainInternal+0x13f [D:\git\coreclr3\src\vm\assembly.cpp @ 1506] 
08 (Inline) -------- CoreCLR!RunMain::__l30::__Body::Run::__l5::__Body::Run+0x7 [D:\git\coreclr3\src\vm\assembly.cpp @ 1577] 
09 0537e738 0f2875c2 CoreCLR!`RunMain'::`30'::__Body::Run+0x2d [D:\git\coreclr3\src\vm\assembly.cpp @ 1579] 
0a 0537e760 0f2861be CoreCLR!RunMain+0x8f [D:\git\coreclr3\src\vm\assembly.cpp @ 1579] 
0b 0537e7b0 0f181a52 CoreCLR!Assembly::ExecuteMainMethod+0x10b [D:\git\coreclr3\src\vm\assembly.cpp @ 1689] 
0c 0537eac0 00d83de8 CoreCLR!CorHost2::ExecuteAssembly+0x193 [D:\git\coreclr3\src\vm\corhost.cpp @ 461] 
0d 0537ebc0 00d84320 CoreRun!TryRun+0x7e5 [D:\git\coreclr3\src\coreclr\hosts\corerun\corerun.cpp @ 697] 
0e 0537fef8 00d92fc0 CoreRun!wmain+0xc5 [D:\git\coreclr3\src\coreclr\hosts\corerun\corerun.cpp @ 815] 
0f 0537ff30 00d92ed2 CoreRun!invoke_main+0x29 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 90] 
10 0537ff50 00d92de0 CoreRun!__scrt_common_main_seh+0xe3 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
11 0537ffb0 00d9302c CoreRun!__scrt_common_main+0x11 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 330] 
12 0537ffc0 724d1df0 CoreRun!wmainCRTStartup+0xd [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_wmain.cpp @ 16] 
13 0537ffd0 779e8f64 KERNEL32!BaseThreadInitThunk+0x21
14 0537ffe8 00000000 ntdll!RtlUserThreadStart+0x35
0:000> r
 r0=00000000  r1=00000000  r2=00000000  r3=0ac30709  r4=05203098  r5=00000000
 r6=0537e5d0  r7=0537e0a0  r8=00000008  r9=00000001 r10=00000000 r11=0537e0b0
r12=0ad4ab05  sp=05203004  lr=0ad4ab23  pc=0f161f80 psr=20000030 --C-- Thumb
CoreCLR!JIT_StackProbe+0x10:
0f161f80 f85d5d04 ldr         r5,[sp,#-4]!                    05203000=00000000

win-arm - windbg - case 2:

(3518.3798): Stack overflow - code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** WARNING: Unable to verify checksum for C:\echesako\coreclr\bin\Product\Windows_NT.arm.Checked\CoreCLR.dll
CoreCLR!JIT_StackProbe+0x10:
0f161f80 f85d5d04 ldr         r5,[sp,#-4]!                    00a03000=00000000
0:000> k
*** WARNING: Unable to verify checksum for CoreRun.exe
 # Child-SP RetAddr  Call Site
00 00a03004 0cc400f0 CoreCLR!JIT_StackProbe+0x10 [D:\git\coreclr3\bin\obj\Windows_NT.arm.Checked\src\vm\wks\asmhelpers.asm @ 6589] 
01 00a13f68 0f1621f4 GitHub_21061_StackOverflowInFuncletProlog!GitHub_21061._00159000.AllocLocal_DivideByZero()+0xbc
*** WARNING: Unable to verify checksum for C:\echesako\coreclr\GitHub_21061_StackOverflowInFuncletProlog.exe
02 00a13f80 0f1d4740 CoreCLR!CallEHFunclet+0x15 [D:\git\coreclr3\bin\obj\Windows_NT.arm.Checked\src\vm\wks\ehhelpers.asm @ 4568] 
03 00a13fa8 0f1d45fc CoreCLR!ExceptionTracker::CallHandler+0xe5 [D:\git\coreclr3\src\vm\exceptionhandling.cpp @ 3409] 
04 00a13ff8 0f1db378 CoreCLR!ExceptionTracker::CallCatchHandler+0x111 [D:\git\coreclr3\src\vm\exceptionhandling.cpp @ 656] 
05 00a14050 779c4788 CoreCLR!ProcessCLRException+0x729 [D:\git\coreclr3\src\vm\exceptionhandling.cpp @ 1192] 
06 00a14248 77a4be18 ntdll!_chkstk+0x179
07 00a14250 724dba16 ntdll!RtlUnwindEx+0x169
08 00a14558 0f1d496c KERNEL32!RtlUnwindEx+0x17
09 00a14570 0f1db346 CoreCLR!ClrUnwindEx+0x25 [D:\git\coreclr3\src\vm\exceptionhandling.cpp @ 5340] 
0a 00a14740 779c4740 CoreCLR!ProcessCLRException+0x6f7 [D:\git\coreclr3\src\vm\exceptionhandling.cpp @ 1165] 
0b 00a14938 77a4bb70 ntdll!_chkstk+0x131
0c 00a14940 779c43a8 ntdll!RtlLogStackBackTrace+0x841
0d 00a14c60 7351fe4e ntdll!KiUserExceptionDispatcher+0x9
0e 00a14e60 0f1e51c0 KERNELBASE!RaiseException+0x3f
0f 00a14ec0 0f1e3ce8 CoreCLR!`RaiseTheExceptionInternalOnly'::`53'::__Body::Run+0x115 [D:\git\coreclr3\src\vm\excep.cpp @ 2954] 
10 00a14ef8 0f1e8ed0 CoreCLR!RaiseTheExceptionInternalOnly+0x279 [D:\git\coreclr3\src\vm\excep.cpp @ 2954] 
11 00a14fa8 0f2690f2 CoreCLR!UnwindAndContinueRethrowHelperAfterCatch+0x85 [D:\git\coreclr3\src\vm\excep.cpp @ 8216] 
12 00a14fd8 0f20c9b6 CoreCLR!__FCThrow+0xa7 [D:\git\coreclr3\src\vm\fcall.cpp @ 29] 
13 00a150b0 0cc400ae CoreCLR!JIT_Div+0x37 [D:\git\coreclr3\src\vm\jithelpers.cpp @ 276] 
14 00a150c8 0b350286 GitHub_21061_StackOverflowInFuncletProlog!GitHub_21061._00159000.AllocLocal_DivideByZero()+0x7a
15 00b7e0b8 0f16104c GitHub_21061_StackOverflowInFuncletProlog!GitHub_21061.Program.Main(System.String[])+0x252
16 00b7e0c0 0f28a870 CoreCLR!CallDescrWorkerInternal+0x45 [D:\git\coreclr3\bin\obj\Windows_NT.arm.Checked\src\vm\wks\asmhelpers.asm @ 4564] 
17 00b7e0d0 0f28ad14 CoreCLR!CallDescrWorker+0x99 [D:\git\coreclr3\src\vm\callhelpers.cpp @ 129] 
18 00b7e4f8 0f28776e CoreCLR!MethodDescCallSite::CallTargetWorker+0x3a9 [D:\git\coreclr3\src\vm\callhelpers.cpp @ 549] 
19 (Inline) -------- CoreCLR!MethodDescCallSite::Call+0x11 [D:\git\coreclr3\src\vm\callhelpers.h @ 459] 
1a 00b7e630 0f287520 CoreCLR!RunMainInternal+0x13f [D:\git\coreclr3\src\vm\assembly.cpp @ 1506] 
1b (Inline) -------- CoreCLR!RunMain::__l30::__Body::Run::__l5::__Body::Run+0x7 [D:\git\coreclr3\src\vm\assembly.cpp @ 1577] 
1c 00b7e738 0f2875c2 CoreCLR!`RunMain'::`30'::__Body::Run+0x2d [D:\git\coreclr3\src\vm\assembly.cpp @ 1579] 
1d 00b7e760 0f2861be CoreCLR!RunMain+0x8f [D:\git\coreclr3\src\vm\assembly.cpp @ 1579] 
1e 00b7e7b0 0f181a52 CoreCLR!Assembly::ExecuteMainMethod+0x10b [D:\git\coreclr3\src\vm\assembly.cpp @ 1689] 
1f 00b7eac0 00d83de8 CoreCLR!CorHost2::ExecuteAssembly+0x193 [D:\git\coreclr3\src\vm\corhost.cpp @ 461] 
20 00b7ebc0 00d84320 CoreRun!TryRun+0x7e5 [D:\git\coreclr3\src\coreclr\hosts\corerun\corerun.cpp @ 697] 
21 00b7fef8 00d92fc0 CoreRun!wmain+0xc5 [D:\git\coreclr3\src\coreclr\hosts\corerun\corerun.cpp @ 815] 
22 00b7ff30 00d92ed2 CoreRun!invoke_main+0x29 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 90] 
23 00b7ff50 00d92de0 CoreRun!__scrt_common_main_seh+0xe3 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
24 00b7ffb0 00d9302c CoreRun!__scrt_common_main+0x11 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 330] 
25 00b7ffc0 724d1df0 CoreRun!wmainCRTStartup+0xd [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_wmain.cpp @ 16] 
26 00b7ffd0 779e8f64 KERNEL32!BaseThreadInitThunk+0x21
27 00b7ffe8 00000000 ntdll!RtlUserThreadStart+0x35
0:000> r
 r0=0737f678  r1=0cc400db  r2=00a14344  r3=0536c3f0  r4=00a03fa0  r5=00000000
 r6=00b7e5d0  r7=00a13f64  r8=00000008  r9=00000001 r10=0000ffcc r11=00b7e0b0
r12=00000023  sp=00a03004  lr=0cc400f1  pc=0f161f80 psr=20000030 --C-- Thumb
CoreCLR!JIT_StackProbe+0x10:
0f161f80 f85d5d04 ldr         r5,[sp,#-4]!                    00a03000=00000000

On linux-arm, in case 2 the debugger for some reason cannot unwind beyond the CallDescrWorkerInternal helper. I checked that this happens even without my changes.

linux-arm - lldb - case 1:

(lldb) bt
* thread #1, name = 'corerun', stop reason = signal SIGSEGV: invalid address (fault address: 0xbe9ff000)
  * frame #0: 0xb675c686 libcoreclr.so`JIT_StackProbe at asmhelpers.S:1480
    frame #1: 0xb27318a6
    frame #2: 0xb272ea16
    frame #3: 0xb675bbba libcoreclr.so`CallDescrWorkerInternal at asmhelpers.S:79
    frame #4: 0xb665aafe libcoreclr.so`CallDescrWorker(pCallDescrData=0xbeffeda8) at callhelpers.cpp:126
    frame #5: 0xb665a9aa libcoreclr.so`CallDescrWorkerWithHandler(pCallDescrData=0xbeffeda8, fCriticalCall=NO) at callhelpers.cpp:70
    frame #6: 0xb665b0b0 libcoreclr.so`MethodDescCallSite::CallTargetWorker(this=<unavailable>, pArguments=0x00000000, pReturnValue=0x00000000, cbReturnValue=0) at callhelpers.cpp:546
    frame #7: 0xb6770618 libcoreclr.so`RunMain(MethodDesc*, short, int*, REF<PtrArray>*) [inlined] MethodDescCallSite::Call(this=0xb4f3e784, pArguments=0xb2e00bc8) at callhelpers.h:459
    frame #8: 0xb6770610 libcoreclr.so`RunMain(MethodDesc*, short, int*, REF<PtrArray>*) at assembly.cpp:1505
    frame #9: 0xb67704d0 libcoreclr.so`RunMain(MethodDesc*, short, int*, REF<PtrArray>*) [inlined] RunMain(MethodDesc*, short, int*, REF<PtrArray>*)::$_1::operator()(Param*) const::'lambda'(Param*)::operator()(Param*) const at assembly.cpp:1577
    frame #10: 0xb67704d0 libcoreclr.so`RunMain(MethodDesc*, short, int*, REF<PtrArray>*) at assembly.cpp:1579
    frame #11: 0xb6770466 libcoreclr.so`RunMain(pFD=<unavailable>, numSkipArgs=<unavailable>, piRetVal=<unavailable>, stringArgs=<unavailable>) at assembly.cpp:1579
    frame #12: 0xb67708c2 libcoreclr.so`Assembly::ExecuteMainMethod(this=<unavailable>, stringArgs=0xbefff278, waitForOtherThreads=YES) at assembly.cpp:1689
    frame #13: 0xb6588958 libcoreclr.so`CorHost2::ExecuteAssembly(this=<unavailable>, dwAppDomainId=<unavailable>, pwzAssemblyPath=<unavailable>, argc=0, argv=<unavailable>, pReturnValue=0xbefff344) at corhost.cpp:460
    frame #14: 0xb656132a libcoreclr.so`::coreclr_execute_assembly(hostHandle=0x0041fc28, domainId=1, argc=0, argv=<unavailable>, managedAssemblyPath=<unavailable>, exitCode=0xbefff344) at unixinterface.cpp:407
    frame #15: 0x0040219c corerun`ExecuteManagedAssembly(currentExeAbsolutePath=0x0041a064, clrFilesAbsolutePath=<unavailable>, managedAssemblyAbsolutePath=0x0041a12c, managedAssemblyArgc=<unavailable>, managedAssemblyArgv=<unavailable>) at coreruncommon.cpp:476
    frame #16: 0x0040143c corerun`corerun(argc=<unavailable>, argv=<unavailable>) at corerun.cpp:149
    frame #17: 0xb6cfffe6 libc.so.6`__libc_start_main(main=(corerun`main + 1 at corerun.cpp:161), argc=2, argv=0xbefff584, init=<unavailable>, fini=(corerun`__libc_csu_fini + 1), rtld_fini=(ld-2.27.so`_dl_fini + 1 at dl-fini.c:50), stack_end=0xbefff584) at libc-start.c:310
    frame #18: 0x00401150 corerun`_start + 52
(lldb) disassemble -A thumbv7
libcoreclr.so`JIT_StackProbe:
    0xb675c676 <+0>:  push   {r7}
    0xb675c678 <+2>:  mov    r7, sp
    0xb675c67a <+4>:  mov    r5, sp
    0xb675c67c <+6>:  bfc    r5, #0, #12
    0xb675c680 <+10>: mov    sp, r5
    0xb675c682 <+12>: subw   sp, sp, #0xffc
->  0xb675c686 <+16>: ldr    r5, [sp, #-4]!
    0xb675c68a <+20>: cmp    sp, r4
    0xb675c68c <+22>: bhi    0xb675c682                ; <+12>
    0xb675c68e <+24>: mov    sp, r7
    0xb675c690 <+26>: pop    {r7}
    0xb675c692 <+28>: bx     lr
(lldb) re r
General Purpose Registers:
        r0 = 0x00000000
        r1 = 0x00000000
        r2 = 0xffffffff
        r3 = 0xb575c229
        r4 = 0xbe9ff880
        r5 = 0x00000000
        r6 = 0x00450788
        r7 = 0xbeffe888
        r8 = 0x00000000
        r9 = 0xbeffed00
       r10 = 0xbeffef00
       r11 = 0xbeffe898
       r12 = 0xb2731889
        sp = 0xbe9ff004
        lr = 0xb27318a7
        pc = 0xb675c686  libcoreclr.so`JIT_StackProbe + 16
      cpsr = 0x200d0030

linux-arm - lldb - case 2:

* thread #1, name = 'corerun', stop reason = signal SIGSEGV: invalid address (fault address: 0xbe9ff000)
    frame #0: 0xb675c686 libcoreclr.so`JIT_StackProbe at asmhelpers.S:1480
(lldb) bt
* thread #1, name = 'corerun', stop reason = signal SIGSEGV: invalid address (fault address: 0xbe9ff000)
  * frame #0: 0xb675c686 libcoreclr.so`JIT_StackProbe at asmhelpers.S:1480
    frame #1: 0xac6870f0
    frame #2: 0xb271e87e
    frame #3: 0xb675bbba libcoreclr.so`CallDescrWorkerInternal at asmhelpers.S:79
(lldb) re r
General Purpose Registers:
        r0 = 0xb2e0611c
        r1 = 0xac6870db
        r2 = 0xbea10464
        r3 = 0x0044e820
        r4 = 0xbe9ffa40
        r5 = 0x00000000
        r6 = 0x00450788
        r7 = 0xbea0fa04
        r8 = 0x00000000
        r9 = 0xbeffed00
       r10 = 0x0000ffcc
       r11 = 0xbeffe898
       r12 = 0xb6c954bc
        sp = 0xbe9ff004
        lr = 0xac6870f1
        pc = 0xb675c686  libcoreclr.so`JIT_StackProbe + 16
      cpsr = 0x200f0030

(lldb) disassemble -A thumbv7
libcoreclr.so`JIT_StackProbe:
    0xb675c676 <+0>:  push   {r7}
    0xb675c678 <+2>:  mov    r7, sp
    0xb675c67a <+4>:  mov    r5, sp
    0xb675c67c <+6>:  bfc    r5, #0, #12
    0xb675c680 <+10>: mov    sp, r5
    0xb675c682 <+12>: subw   sp, sp, #0xffc
->  0xb675c686 <+16>: ldr    r5, [sp, #-4]!
    0xb675c68a <+20>: cmp    sp, r4
    0xb675c68c <+22>: bhi    0xb675c682                ; <+12>
    0xb675c68e <+24>: mov    sp, r7
    0xb675c690 <+26>: pop    {r7}
    0xb675c692 <+28>: bx     lr

I also fixed the helpers as Jan suggested above, so that they probe at the bottom of each page (i.e. at addresses 0xYYYYY000) and sp stays 4-byte aligned at all times.
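As a rough illustration of that probing pattern, the loop in the JIT_StackProbe disassembly above can be modelled in a few lines of Python. This is a sketch of the address arithmetic only, not the helper itself. It assumes that r4 holds the target sp passed by the caller, that r7 in the register dumps still holds the sp captured by `mov r7, sp`, and that every read succeeds (the real runs fault on the guard page). The function name `probe_addresses` is made up for this example.

```python
PAGE_MASK = 0xFFF  # 4 KiB pages, matching `bfc r5, #0, #12`


def probe_addresses(sp_after_push, target_sp):
    """Return the addresses read by the modelled probe loop, in order.

    sp_after_push: value captured in r7 by `mov r7, sp`
    target_sp:     value the caller passes in r4
    """
    sp = sp_after_push & ~PAGE_MASK   # mov r5, sp; bfc r5, #0, #12; mov sp, r5
    probed = []
    while True:
        sp -= 0xFFC                   # subw sp, sp, #0xffc
        sp -= 4                       # ldr r5, [sp, #-4]! pre-decrements sp ...
        probed.append(sp)             # ... and reads the word at the new sp
        if not sp > target_sp:        # cmp sp, r4; bhi loop (unsigned higher)
            break
    return probed


# r7 and r4 taken from the linux-arm register dumps for case 1 and case 2.
case1 = probe_addresses(0xBEFFE888, 0xBE9FF880)
case2 = probe_addresses(0xBEA0FA04, 0xBE9FFA40)
print(hex(case1[0]), hex(case1[-1]), len(case1))
print(hex(case2[0]), hex(case2[-1]), len(case2))
```

Every address in both lists has its low 12 bits clear, so each probe lands on a page boundary and sp stays 4-byte aligned between steps. With the register values from the two linux-arm dumps, the last address the model would read is 0xbe9ff000, which is the fault address lldb reports in both cases.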

@echesakov

/azp run coreclr-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


@janvorli janvorli left a comment

LGTM, thank you!

@echesakov echesakov requested a review from a team October 25, 2019 16:14

@erozenfeld erozenfeld left a comment

LGTM with a few comment notes.

[Resolved review thread on src/jit/codegenarm.cpp]
[Resolved review thread on src/jit/codegenarm64.cpp (outdated)]
[Resolved review thread on src/jit/codegenarm64.cpp]
@echesakov echesakov merged commit eae780c into dotnet:master Oct 29, 2019
@echesakov echesakov deleted the JitStackProbeHelperArmArch branch October 29, 2019 02:02