-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Enable inlining for late devirtualization #110827
Conversation
6173a81
to
e287f69
Compare
e287f69
to
686f64e
Compare
cc @AndyAyersMS |
Skimmed the changes and it seems promising. I'll take a deeper look soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is in good shape, just one last small thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM -- @jakobbotsch any other feedback?
I'm testing TP impact of splitting trees in #111896. If it turns out that the TP impact has the similar pattern with the check in this PR, I will merge it into this branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well as is
Seems that TP with splitting the tree doesn't have meaningful change: https://dev.azure.com/dnceng-public/public/_build/results?buildId=931974&view=ms.vss-build-web.run-extensions-tab Merged the commits into this PR. |
src/coreclr/jit/fginline.cpp
Outdated
{ | ||
Statement* newStmt = nullptr; | ||
GenTree** callUse = nullptr; | ||
if (m_compiler->gtSplitTree(m_compiler->compCurBB, m_curStmt, call, &newStmt, &callUse)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gtSplitTree
needs to learn that it has to split lvHasLdAddrOp
locals when it runs early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/coreclr/jit/fginline.cpp
Outdated
{ | ||
Statement* newStmt = nullptr; | ||
GenTree** callUse = nullptr; | ||
if (m_compiler->gtSplitTree(m_compiler->compCurBB, m_curStmt, call, &newStmt, &callUse, true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using context != nullptr
as a check if devirtualization succeeded? A better check is !call->IsVirtual()
.
We should avoid splitting the tree if devirtualization fails, and also if the call can't be made into an inline candidate (seems like impMarkInlineCandidate
can be called on a non-root call...?)
So maybe reorder things a bit.
I'd also be fine if you revert all this and go back to the non-split version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it's not obvious that splitting a tree that we're walking will work out as expected. I think it's ok here, especially since we back up and will reprocess the current tree, but it feels like it would be cleaner if we aborted the walk at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah context
will be nullptr
if devirtualization failed.
Switched to use !call.IsVirtual()
instead. The context will be used while calling impMarkInlineCandidate
so I added an assertion to make sure it's correctly set.
Resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Will give @jakobbotsch one last chance to look too.
{ | ||
Statement* newStmt = nullptr; | ||
GenTree** callUse = nullptr; | ||
if (m_compiler->gtSplitTree(m_compiler->compCurBB, m_curStmt, call, &newStmt, &callUse, true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking this split is unnecessarily conservative: it will introduce temps for the arguments of the call, which can bypass optimizations the inliner can make depending on the specific shape of the argument. Fixing that requires enhancing gtSplitTree
a bit (see e.g. my version in this branch), however, even with that done you will run into issues around incorrect treatment of the retbuffer in the inliner. So I wouldn't suggest trying to do anything about it at this point, but it is a potential future improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@hez2010 thanks again! |
* main: Make cdac APIs public but experimental (dotnet#111180) JIT: Enable inlining for late devirtualization (dotnet#110827) Remove unsafe `bool` casts (dotnet#111024) Fix shimmed implementation of TryGetHashAndReset to handle HMAC.
} | ||
|
||
call->GetSingleInlineCandidateInfo()->exactContextHandle = context; | ||
INDEBUG(call->GetSingleInlineCandidateInfo()->inlinersContext = call->gtInlineContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hez2010 I think this Debug logic is causing Debug and Release JITs to report differing amounts of metadata to SuperPMI. runtime-coreclr superpmi-asmdiffs-checked-release broke around the time this PR was merged, and I'm not getting any diffs locally when using builds from the commit before this one.
cc @AndyAyersMS @jakobbotsch. I'm not sure what the ideal fix is here, since GenTreeCall::gtInlineContext
is Debug-only, and removing this logic altogether triggers Debug asserts in InlineStrategy::NewContext
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may need to move call->gtInlineContext
out of #if DEBUG
and always set call->GetSingleInlineCandidateInfo()->inlinersContext
accordingly, otherwise inlinersContext
can sometimes be incorrect because we don't keep tracking it after the importer is done, and leading to inlining failure (which will be caught silently and reverted so it won't affect the correctness of codegen). This may explain why you are seeing different amounts of metadata to SuperPMI, where in Release we inline fewer methods due to the incorrect inlinersContext
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the information be put into LateDevirtualizationInfo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have avoided trying to add more fields to GenTreeCall that are only needed for some small subset of all calls.
What @jakobbotsch suggests makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I revised this issue and found that the issue is that the debug logic should really be
INDEBUG(call->gtInlineContext = call->GetSingleInlineCandidateInfo()->inlinersContext);
instead.
We set a new inlinersContext
for the inline candidate, but didn't update the original InlineContext
for the call. Sorry for this dumb typo mistake... I should really have noticed this before.
The fix is now live: #112396
If we see a new inline candidate after late devirtualization, we can try marking and inlining it.
This unblocks the inlining and stack-allocating arbitrary ref-class enumerators (unless the inliner considers
GetEnumerator
as non-profitable). We might need to tune the inliner heuristics later to get more profitable inlining opportunities.Contributes to #7541 and #108913
There're some really nice diffs: https://gist.github.com/MihuBot/29f7c64533ac1f38494fbfab361ab505
Example:
Before:
GDV:
After:
/cc: @AndyAyersMS