Support fast tailcalls in R2R #56669
Merged
Commits
28 commits
7137494 Support fast tailcalls in R2R (jakobbotsch)
0efbbc4 Support ARM64 (jakobbotsch)
5b0127e Run jit-format (jakobbotsch)
2b360c8 Fix non-R2R ARM build (jakobbotsch)
ae8fd46 Fix recursive-call-to-loop optimization with non-standard args (jakobbotsch)
e45f0c5 Implement new delay load helper for fast tailcalls (jakobbotsch)
7f21f31 Minor changes and fix build break (jakobbotsch)
5ce1e65 Switch to a define for tailcall register (jakobbotsch)
81268fa Fix x86 (jakobbotsch)
8e15c1c Implement DefType.IsUnsafeValueType (jakobbotsch)
b971fce Emit rex.jmp for tailcall jumps on x64 (jakobbotsch)
483b34a Refactor non standard args + refix recursive tailcall opt (jakobbotsch)
b533f4b Set nonStandardArgKind for stack args also (jakobbotsch)
1c0ede7 Merge remote-tracking branch 'upstream/main' into cg2-fast-tailcalls (jakobbotsch)
424d538 Regenerate JIT interface (jakobbotsch)
24cfa6f Merge remote-tracking branch 'upstream/main' into cg2-fast-tailcalls (jakobbotsch)
0968851 Improve recursion-to-loop decision about which args need to be reassi… (jakobbotsch)
91a665a Merge remote-tracking branch 'upstream/main' into cg2-fast-tailcalls (jakobbotsch)
b345db7 Use INS_tail_i_jmp for func token indir (jakobbotsch)
0816d52 More efficient arm64 VSD fast tailcalls, fix some bad merging (jakobbotsch)
863ad4d Take R2R indirection into account for tail call profitability (jakobbotsch)
0778263 Disallow tailcalls via JIT helper in R2R builds (jakobbotsch)
92cd381 Merge remote-tracking branch 'upstream/main' into cg2-fast-tailcalls (jakobbotsch)
16e32b0 Revert "Take R2R indirection into account for tail call profitability" (jakobbotsch)
d5729a7 Add SPMI support, clean up mcPackets enum (jakobbotsch)
b8ad2ba Merge remote-tracking branch 'upstream/main' into cg2-fast-tailcalls (jakobbotsch)
9f70cc4 Fix conflicts (jakobbotsch)
ad78caf Take necessary conditions into account in canTailCall (jakobbotsch)
src/coreclr/ToolBox/superpmi/superpmi-shared/methodcontext.h
386 changes: 193 additions & 193 deletions (large diff, not rendered)
@@ -3759,15 +3759,19 @@ GenTree* Lowering::LowerDirectCall(GenTreeCall* call)
 
         case IAT_PVALUE:
         {
-            bool isR2RRelativeIndir = false;
-#if defined(FEATURE_READYTORUN) && defined(TARGET_ARMARCH)
+            bool hasIndirectionCell = false;
+#if defined(TARGET_ARMARCH)
             // Skip inserting the indirection node to load the address that is already
             // computed in REG_R2R_INDIRECT_PARAM as a hidden parameter. Instead during the
             // codegen, just load the call target from REG_R2R_INDIRECT_PARAM.
-            isR2RRelativeIndir = call->IsR2RRelativeIndir();
-#endif // FEATURE_READYTORUN && TARGET_ARMARCH
+            hasIndirectionCell = call->IsR2RRelativeIndir();
+#elif defined(TARGET_XARCH)
+            // For xarch we usually get the indirection cell from the return address,
+            // except for fast tailcalls where we do the same as ARM.
+            hasIndirectionCell = call->IsR2RRelativeIndir() && call->IsFastTailCall();
+#endif
 
-            if (!isR2RRelativeIndir)
+            if (!hasIndirectionCell)
             {
                 // Non-virtual direct calls to addresses accessed by
                 // a single indirection.
@@ -4834,15 +4838,12 @@ GenTree* Lowering::LowerVirtualStubCall(GenTreeCall* call)
         }
         else
         {
-
             bool shouldOptimizeVirtualStubCall = false;
 #if defined(FEATURE_READYTORUN) && defined(TARGET_ARMARCH)
             // Skip inserting the indirection node to load the address that is already
             // computed in REG_R2R_INDIRECT_PARAM as a hidden parameter. Instead during the
             // codegen, just load the call target from REG_R2R_INDIRECT_PARAM.
-            // However, for tail calls, the call target is always computed in RBM_FASTTAILCALL_TARGET
-            // and so do not optimize virtual stub calls for such cases.
-            shouldOptimizeVirtualStubCall = !call->IsTailCall();
+            shouldOptimizeVirtualStubCall = true;
 #endif // FEATURE_READYTORUN && TARGET_ARMARCH
 
             if (!shouldOptimizeVirtualStubCall)

Review comment on the `shouldOptimizeVirtualStubCall = true;` line: Nice!
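To make the new condition in the `LowerDirectCall` hunk easier to read, here is a minimal standalone sketch that expresses the `#if`/`#elif` as an ordinary function. It is not JIT code: `TargetArch`, `CallInfo`, and `hasIndirectionCell` as a free function are hypothetical stand-ins, and the two flags only mirror what `call->IsR2RRelativeIndir()` and `call->IsFastTailCall()` report in the diff.

```cpp
// Hypothetical stand-in for the TARGET_ARMARCH / TARGET_XARCH build-time defines.
enum class TargetArch { ArmArch, XArch };

// Hypothetical stand-in for the two GenTreeCall queries used in the hunk.
struct CallInfo {
    bool isR2RRelativeIndir; // mirrors call->IsR2RRelativeIndir()
    bool isFastTailCall;     // mirrors call->IsFastTailCall()
};

// Returns true when lowering should skip the explicit indirection node,
// following the conditions shown in the diff.
inline bool hasIndirectionCell(TargetArch arch, const CallInfo& call) {
    switch (arch) {
        case TargetArch::ArmArch:
            // ARM/ARM64: any R2R relative-indirect call qualifies.
            return call.isR2RRelativeIndir;
        case TargetArch::XArch:
            // xarch: only fast tailcalls are handled the same way as ARM.
            return call.isR2RRelativeIndir && call.isFastTailCall;
    }
    return false;
}
```

Under this model, an ordinary R2R relative-indirect call takes the indirection-cell path only on ARM/ARM64, while on xarch it does so only when it is also a fast tailcall.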
I'm confused about the role of `tmpReg` in the fast tail call case. How does it end up having the right contents?

It happens in `genCall` here: https://github.com/jakobbotsch/runtime/blob/9f70cc42d73467c15a458ee9774589271d721536/src/coreclr/jit/codegenarmarch.cpp#L2340-L2375

For normal calls we call `genCall`, which then calls `genCallInstruction`, which takes care of generating the code to load the call target and do the call.

For tailcalls we instead generate the code to load the call target in `genCall`, but this is the last thing that happens when we see the `GenTreeCall` node. The remaining work happens during epilog generation, which also calls `genCallInstruction`.

So you could pass REG_NA for the tail call case?

I guess it's `GetSingleTempReg` that is confusing me. It seems odd to "allocate" a temp reg without altering state.

Not sure I understand. The temp reg is allocated during RA for this specific optimization we do on ARM/ARM64, where we have no target node and need an extra register to store the call target loaded from the indirection cell. We still need it for the tail call case.

Right, not that it's not ultimately needed, but that the value of `tmpReg` passed here to `genCall` is not used; instead we go call `GetSingleTempReg` again.

It's not that important, I just found it confusing to follow how the value gets from one place to another.

This code here in `genCallInstruction` is called last. `genCall` also calls `GetSingleTempReg`, but it adds the register back to `gtRsvdRegs` so that this call will succeed: https://github.com/jakobbotsch/runtime/blob/9f70cc42d73467c15a458ee9774589271d721536/src/coreclr/jit/codegenarmarch.cpp#L2369-L2370

Thanks, makes more sense now.
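
For readers following the thread, the "take the temp register, then put it back" pattern can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyNode`, `firstUse`, and `secondUse` are hypothetical names, the reserved set is modeled as a plain bit mask, and the toy `getSingleTempReg` always consumes the register, which is a simplification and not a claim about how the JIT's `GetSingleTempReg` behaves in every build.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a node that carries a set of reserved temp registers.
struct ToyNode {
    std::uint32_t rsvdRegs; // bit i set => register i is reserved for this node

    // Returns the single reserved register and removes it from the set,
    // so a second call would fail unless the register is added back.
    int getSingleTempReg() {
        // Exactly one bit must be set.
        assert(rsvdRegs != 0 && (rsvdRegs & (rsvdRegs - 1)) == 0);
        int reg = 0;
        while (((rsvdRegs >> reg) & 1u) == 0) {
            ++reg;
        }
        rsvdRegs &= ~(1u << reg);
        return reg;
    }
};

// Plays the role described for genCall in the tailcall case: it uses the
// temp register first, then hands it back for the later consumer.
inline int firstUse(ToyNode& node) {
    int reg = node.getSingleTempReg();
    node.rsvdRegs |= (1u << reg); // add the register back to the reserved set
    return reg;
}

// Plays the role described for genCallInstruction: it is called last and
// relies on the register still being in the reserved set.
inline int secondUse(ToyNode& node) {
    return node.getSingleTempReg();
}
```

In this model both calls return the same register, which matches the explanation in the thread: the later lookup succeeds only because the earlier one restored the reserved set.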