-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XARCH: Remove redudant tests for GT_LT/GT_GE relops. #61152
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsThis PR addresses #11414 but continues on with some of the work discussed in the Background section below. ChangesWe can now optimize cases such as
now transform to
This transform is valid for signed GE and signed LT only. BackgroundIt looks like dotnet/coreclr#14027 started some logic for reusing flags during the lowering stage, at which point some of the cases had to be restricted due to the OF #9059. There was some discussion about making this work for More recently, a PR #38586 added flag tracking during the codegen/emit stage, and addressed some remaining optimization cases, e.g., I think with PR 38586, we can handle cases like ResultsThe following is from an older run with superpmi (happy to grab an updated one if needed). Overall, the change is an improvement. From inspecting some of the diffs, the absence of test instruction impacts some of the loop alignment. A lot of [align X bytes] were optimized out, but in some cases the change actually forced the insertion of alignment. Also, although this is an edge case, I noticed that with inlining there was a greater amount of
|
8982a2e
to
2af00e1
Compare
2af00e1
to
56833ae
Compare
We can now optimize cases such as `(x + y < 0)` or `for (int x = v; x >= 0; x--)` using the flag tracking logic during the emit stage. Notably, cases that would generate... ``` add reg0, reg1 test reg0, reg0 jge LABEL ``` now transform to ``` add reg0, reg1 jns LABEL ``` This transform is valid for signed GE and signed LT only.
56833ae
to
7fbe973
Compare
I believe the Linux x64 test |
Hi @JulieLeeMSFT , can you please assign this to someone who can review the PR? |
Hi @anthonycanino, I'll take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @dotnet/jit-contrib
Overall this looks good to me. I'll run some extra CI legs here for you just in case.
Can you add the diff summary report (artifacts\spmi\diff_summary.md)?
{ | ||
assert(reg != REG_NA); | ||
|
||
// Don't look back across IG boundaries (possible control flow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a check like
if (!emitComp->opts.OptimizationEnabled())
{
return false;
}
(or else assert we're optimizing)
I realize this is also guaranteed by the way the current caller sets canReuseFlags
but would prefer we check these things closer to where we actually do the transformation.
(and add similar to AreFlagsSetToZeroCmp
if you don't mind)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in both spots. Thanks!
else if (canReuseFlags && emit->AreFlagsSetForSignJumpOpt(op1->GetRegNum(), emitTypeSize(type), tree)) | ||
{ | ||
JITDUMP("Not emitting compare due to sign being already set, follow up instr will transform jump\n"); | ||
tree->gtFlags |= GTF_RELOP_SJUMP_OPT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you either always clear this new flag up above before the if
on line 6231, or assert there that it's not set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the assertion next to the other ones. Please let me know if the comment needs to change or if its good enough.
insFormat fmt = id->idInsFmt(); | ||
|
||
// make sure op1 is a reg | ||
switch (fmt) | ||
{ | ||
case IF_RWR_CNS: | ||
case IF_RRW_CNS: | ||
case IF_RRW_SHF: | ||
case IF_RWR_RRD: | ||
case IF_RRW_RRD: | ||
case IF_RWR_MRD: | ||
case IF_RWR_SRD: | ||
case IF_RRW_SRD: | ||
case IF_RWR_ARD: | ||
case IF_RRW_ARD: | ||
case IF_RWR: | ||
case IF_RRD: | ||
case IF_RRW: | ||
break; | ||
default: | ||
return false; | ||
} | ||
|
||
if (id->idReg1() != reg) | ||
{ | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should think about extracting things that are common as helper methods. But maybe do this as a zero-diff follow up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, happy to do a follow up refactor.
/azp run Fuzzlyn |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
x86 Fuzzlyn failures seem unrelated, arm failures certainly are. Linux x64 failures are instances of #61237 which is unrelated and has been fixed. So things look good on the testing front. |
Thanks @AndyAyersMS . Here is the diff_summary.md report. I will respond to your comments inline. aspnet.run.windows.x64.checked.mch:
Detail diffs
benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.crossgen2.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
|
0003781
to
e0fc4e4
Compare
e0fc4e4
to
19b3a91
Compare
One way to verify the real impact would be turning off loop alignment for both baseline/diff using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM...@tannergooding, you mind taking another look?
} | ||
|
||
// Don't look back across IG boundaries (possible control flow) | ||
if (emitCurIGinsCnt == 0 && ((emitCurIG->igFlags & IGF_EXTEND) == 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check emitCurIGinsCnt == 0 && ((emitCurIG->igFlags & IGF_EXTEND) == 0)
is becoming more and more popular (we already do it at 6 different places), can you extract this in a method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok to have a follow-up PR as Andy pointed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can do this with a follow-up PR. Happy to help.
Arm regex test timed out, seeing if retry will fix things. |
@AndyAyersMS looks like everything passed now. Thanks for rerunning that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Will hold off merging for a while to give @tannergooding a chance to review too.
Ok, thanks for reviewing @AndyAyersMS and @kunalspathak ! |
@tannergooding do you want to look this over? If not I'll go ahead and merge. |
@anthonycanino thanks for the contribution! |
This PR addresses #11414 but continues on with some of the work discussed in the Background section below.
Changes
We can now optimize cases such as
(x + y < 0)
orfor (int x = v; x >= 0; x--)
using the flag tracking logic during the emit stage. Notably, cases that
would generate...
now transform to
This transform is valid for signed GE and signed LT only.
Background
It looks like dotnet/coreclr#14027 started some logic for reusing flags during the lowering stage, at which point some of the cases had to be restricted due to the OF #9059. There was some discussion about making this work for
ADD/SUB + LT/GE
but it looks like it was abandoned due to low optimization opportunity #6794 (comment).More recently, a PR #38586 added flag tracking during the codegen/emit stage, and addressed some remaining optimization cases, e.g.,
(x ^ y) < 0
.I think with PR 38586, we can handle cases like
if (x - y) < 0
, or more importantly,for (int x = v; x >= 0; x -= 2)
as we can now look at whether the previous instruction set the SF, and if we are in a GT_GE/GT_LT with a SLT,/SGE and instead emit a JS/JNS instead.Results
The following is from an older run with superpmi (happy to grab an updated one if needed).
Overall, the change is an improvement. From inspecting some of the diffs, the absence of test instruction impacts some of the loop alignment. A lot of [align X bytes] were optimized out, but in some cases the change actually forced the insertion of alignment. Also, although this is an edge case, I noticed that with inlining there was a greater amount of
(x - y) < 0
cases, as I saw a lot ofMath.abs(x - y)
, which inlined to a case where we could remove a redundant check.