Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

[Arm64/Windows] Backport #9500 changes #10190

Merged
merged 5 commits into from
Apr 4, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion src/vm/arm64/CallDescrWorkerARM64.asm
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,8 @@
;;void CallDescrWorkerInternal(CallDescrData * pCallDescrData);
NESTED_ENTRY CallDescrWorkerInternal,,CallDescrWorkerUnwindFrameChainHandler
PROLOG_SAVE_REG_PAIR fp, lr, #-32!
PROLOG_SAVE_REG x19, #16 ;the stack slot at sp+24 is empty for 16 byte alligment
PROLOG_SAVE_REG x19, #16
PROLOG_SAVE_REG x0, #24

mov x19, x0 ; save pCallDescrData in x19

Expand Down Expand Up @@ -76,6 +77,8 @@ LNoFloatingPoint
ldr x9, [x19,#CallDescrData__pTarget]
blr x9

ldr x19, [fp, 24] ; Fixup corrupted X19 callee preserved register
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x19 is callee saved so it should not be trashed on return from call. Can you explain who/when is x19 getting corrupted?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is a workaround for unexpected x19 corruption. I had tried to debug this failure by storing the correct value for x19 on the stack above and adding a conditional break, but was unable to do so. The test seemed to fail to make forward progress.

Let me try to reproduce. I may have better luck as the GC is more stable now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After removing the equivalent of this line, there are 43 failures on [Arm64/Unix] tip.

./JIT.Directed.Misc.function_pointer.MutualThdRecur-fptr.MutualThdRecur-fptr.sh
./JIT.Directed.tailcall.tailcall.tailcall.sh
./JIT.Methodical.Boxing.misc._dbgconcurgc_il._dbgconcurgc_il.sh
./JIT.Methodical.Boxing.misc._relconcurgc_il._relconcurgc_il.sh
./JIT.Methodical.switch.switch10.switch10.sh
./JIT.Methodical.switch.switch11.switch11.sh
./JIT.Methodical.switch.switch1.switch1.sh
./JIT.Methodical.switch.switch2.switch2.sh
./JIT.Methodical.switch.switch3.switch3.sh
./JIT.Methodical.switch.switch4.switch4.sh
./JIT.Methodical.switch.switch5.switch5.sh
./JIT.Methodical.switch.switch6.switch6.sh
./JIT.Methodical.switch.switch7.switch7.sh
./JIT.Methodical.switch.switch8.switch8.sh
./JIT.Methodical.switch.switch9.switch9.sh
./JIT.Methodical.tailcall._il_dbgrecurse_ep_void._il_dbgrecurse_ep_void.sh
./JIT.Methodical.tailcall._il_dbgtest_2a._il_dbgtest_2a.sh
./JIT.Methodical.tailcall._il_dbgtest_2b._il_dbgtest_2b.sh
./JIT.Methodical.tailcall._il_dbgtest_2c._il_dbgtest_2c.sh
./JIT.Methodical.tailcall._il_dbgtest_implicit._il_dbgtest_implicit.sh
./JIT.Methodical.tailcall._il_dbgtest_mutual_rec._il_dbgtest_mutual_rec.sh
./JIT.Methodical.tailcall._il_dbgtest_switch._il_dbgtest_switch.sh
./JIT.Methodical.tailcall._il_dbgtest_virt._il_dbgtest_virt.sh
./JIT.Methodical.tailcall._il_dbgtest_void._il_dbgtest_void.sh
./JIT.Methodical.tailcall._il_relrecurse_ep_void._il_relrecurse_ep_void.sh
./JIT.Methodical.tailcall._il_reltest_2a._il_reltest_2a.sh
./JIT.Methodical.tailcall._il_reltest_2b._il_reltest_2b.sh
./JIT.Methodical.tailcall._il_reltest_2c._il_reltest_2c.sh
./JIT.Methodical.tailcall._il_reltest_implicit._il_reltest_implicit.sh
./JIT.Methodical.tailcall._il_reltest_mutual_rec._il_reltest_mutual_rec.sh
./JIT.Methodical.tailcall._il_reltest_switch._il_reltest_switch.sh
./JIT.Methodical.tailcall._il_reltest_virt._il_reltest_virt.sh
./JIT.Methodical.tailcall._il_reltest_void._il_reltest_void.sh
./JIT.Regression.CLR-x86-JIT.V1.1-M1-Beta1.b143840.b143840.b143840.sh
./JIT.Regression.CLR-x86-JIT.V1-M12-Beta2.b65423.b65423.b65423.sh
./JIT.Regression.CLR-x86-JIT.V1-QFE.b151440.static-none.static-none.sh
./JIT.Regression.CLR-x86-JIT.V2.0-Beta2.b426654.b426654.b426654.sh
./JIT.Regression.Dev11.dev11_10427.conv_ovf_i4.conv_ovf_i4.sh
./JIT.Regression.VS-ia64-JIT.V1.2-M02.b14364.b14364.b14364.sh
./Loader.classloader.TypeInitialization.CircularCctors.CircularCctorFourThreadsBFI.CircularCctorFourThreadsBFI.sh
./Loader.classloader.TypeInitialization.CircularCctors.CircularCctorThreeThreads01BFI.CircularCctorThreeThreads01BFI.sh
./Loader.classloader.TypeInitialization.CircularCctors.CircularCctorThreeThreads02BFI.CircularCctorThreeThreads02BFI.sh
./Loader.classloader.TypeInitialization.CircularCctors.CircularCctorThreeThreads03BFI.CircularCctorThreeThreads03BFI.sh

The failure signatures are not exactly what I remember. I will attempt to debug a shorter one to see if I can root cause the issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the failures are a regression on the tip . When the regression is fixed, looks like I may be able to remove this workaround from the [Arm64/Unix] assembly.


ldr w3, [x19,#CallDescrData__fpReturnSize]

;; Int return case
Expand Down
36 changes: 26 additions & 10 deletions src/vm/arm64/asmhelpers.asm
Original file line number Diff line number Diff line change
Expand Up @@ -159,6 +159,7 @@
RestoreRegMS 26, X26
RestoreRegMS 27, X27
RestoreRegMS 28, X28
RestoreRegMS 29, X29

Done
; Its imperative that the return value of HelperMethodFrameRestoreState is zero
Expand Down Expand Up @@ -991,33 +992,48 @@ UM2MThunk_WrapperHelper_RegArgumentsSetup

; This helper enables us to call into a funclet after restoring Fp register
NESTED_ENTRY CallEHFunclet

; Using below prolog instead of PROLOG_SAVE_REG_PAIR fp,lr, #-16!
; is intentional. Above statement would also emit instruction to save
; sp in fp. If sp is saved in fp in prolog then it is not expected that fp can change in the body
; of method. However, this method needs to be able to change fp before calling funclet.
; This is required to access locals in funclet.
PROLOG_SAVE_REG_PAIR x19,x20, #-16!
PROLOG_SAVE_REG fp, #0
PROLOG_SAVE_REG lr, #8

; On entry:
;
; X0 = throwable
; X1 = PC to invoke
; X2 = address of X19 register in CONTEXT record; used to restore the non-volatile registers of CrawlFrame
; X3 = address of the location where the SP of funclet's caller (i.e. this helper) should be saved.
;

; Using below prolog instead of PROLOG_SAVE_REG_PAIR fp,lr, #-16!
; is intentional. Above statement would also emit instruction to save
; sp in fp. If sp is saved in fp in prolog then it is not expected that fp can change in the body
; of method. However, this method needs to be able to change fp before calling funclet.
; This is required to access locals in funclet.
PROLOG_SAVE_REG_PAIR x29,lr, #-96!
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing test is causing an exception to happen in a finalizer (being executed as part of unwind). At that point OS is not able to unwind correctly out of CallEHFunction. This is because CallEHFunction modifies fp outside of the prolog of the function. CallEHFunction needs to be able to modify fp outside of the prolog to set the correct fp in order to get the locals for the finalizer. In order to make that happen prolog must not assume that fp can be used to unwind out of the function.
This line translates to
stp x29,lr [sp,#-0x60]!
mov fp,sp
So generated unwind codes assume that fp can be used to unwind out of the function. Therefor using x29/fp in this line is not correct. If you revert this line back to what it was previously then things work fine
PROLOG_SAVE_REG_PAIR x19,x20, #-96!
PROLOG_SAVE_REG fp, #0
PROLOG_SAVE_REG lr, #8

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Using x29 instead of FP was able to trick the unixmacro preprocessor to avoid that unwinding code. I do not see the Windows version of the macros, they must be part of the ksarm64.h header.

@rahku Do you want me to push the patch?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows has another macro called PROLOG_SAVE_REG_PAIR_NO_FP that could be used here instead of doing this trick. @rahku do you think it would make sense to use it instead?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that works too. I just verified that. I think this macro was not present when I did this work initially. @sdmaclea can you update this PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure


; Spill callee saved registers
PROLOG_SAVE_REG_PAIR x19, x20, 16
PROLOG_SAVE_REG_PAIR x21, x22, 32
PROLOG_SAVE_REG_PAIR x23, x24, 48
PROLOG_SAVE_REG_PAIR x25, x26, 64
PROLOG_SAVE_REG_PAIR x27, x28, 80

; Save the SP of this function. We cannot store SP directly.
mov fp, sp
str fp, [x3]

ldp x19, x20, [x2, #0]
ldp x21, x22, [x2, #16]
ldp x23, x24, [x2, #32]
ldp x25, x26, [x2, #48]
ldp x27, x28, [x2, #64]
ldr fp, [x2, #80] ; offset of fp in CONTEXT relative to X19

; Invoke the funclet
blr x1
nop

EPILOG_RESTORE_REG_PAIR x19, x20, 16
EPILOG_RESTORE_REG_PAIR x21, x22, 32
EPILOG_RESTORE_REG_PAIR x23, x24, 48
EPILOG_RESTORE_REG_PAIR x25, x26, 64
EPILOG_RESTORE_REG_PAIR x27, x28, 80
EPILOG_RESTORE_REG_PAIR fp, lr, #16!
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be EPILOG_RESTORE_REG_PAIR fp, lr, #96!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rahku Yes you are right. Show how dependent I am on compile test workflow....

EPILOG_RETURN

Expand Down
2 changes: 1 addition & 1 deletion src/vm/arm64/calldescrworkerarm64.S
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ LOCAL_LABEL(NoFloatingPoint):
ldr x9, [x19,#CallDescrData__pTarget]
blr x9

ldr x19, [fp, 24] // Fixup corrupted X19 caller preserved register
ldr x19, [fp, 24] // Fixup corrupted X19 callee preserved register

ldr w3, [x19,#CallDescrData__fpReturnSize]

Expand Down