-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ARM64 unsigned div by const perf regression #57400
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsThis fixes #54166. This is the alternative version of #57372 that just improves Diff: https://www.diffchecker.com/9mkVf4se Unfortunately this doesn't yet solve #47490, probably because the operand nodes are extended to 64-bit earlier. cc @AndyAyersMS @echesakovMSFT
|
CC @AndyAyersMS please review the code. |
Changes look reasonable to me, but I'll defer to @echesakov here... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this is a low risk change for .NET 6, but longer term I'd think we'd want to support GT_MUL_LONG
in ARM64 codegen directly and use the existing methods to detect it in lowering.
I ran Here are the results (baseline is e0f6071): BenchmarkDotNet=v0.12.1.1466-nightly, OS=Windows 10.0.19043
Microsoft SQ2 3.15 GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.100-preview.7.21379.14
[Host] : .NET 5.0.9 (5.0.921.35908), Arm64 RyuJIT
Job-TRWEYK : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
Job-OBGLQV : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
Job-COTIHO : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1
|
@echesakov diff seems improvements. Do you approve the change? |
@JulieLeeMSFT Yes, the numbers were mostly improvements with an exception of
|
The spmi-asmdiffs are also mostly positive with an exception of I will take a look at the diff to understand what is going on. benchmarks.run.windows.arm64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.arm64.checked.mch:
Detail diffs
libraries.crossgen2.windows.arm64.checked.mch:
Detail diffs
libraries.pmi.windows.arm64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.arm64.checked.mch:
Detail diffs
coreclr_tests.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries.crossgen2.Linux.arm64.checked.mch:
Detail diffs
libraries.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries_tests.pmi.Linux.arm64.checked.mch:
Detail diffs
|
@echesakov ToString numbers show some improvments and some regression. So, do you want that to be addressed before approving the code? |
@JulieLeeMSFT I wouldn't worry about |
Thanks for the note @SingleAccretion - I wasn't aware of that issue. An example of such regression - ldr x0, [x22,#0xd1ffab1e]
+ movz x0, #0xd1ffab1e
+ movk x0, #0xd1ffab1e LSL #16
+ movk x0, #0xd1ffab1e LSL #32
+ ldr x0, [x0] |
@pentp Can you please address the feedback by @SingleAccretion, so we can merge the PR before RC1? |
@echesakov do you want to push the changes discussed above? We are running out of time for RC1. |
@pentp I pushed the changes suggested by @SingleAccretion. |
@AndyAyersMS Can you please also take a look? In case I missed anything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too.
runtime (Libraries Test Run checked coreclr windows x86 Release) failure is due to #57471 |
runtime (Installer Build and Test coreclr windows_x64 Release) is an infra issue:
|
Since the change is arm64 specific I think we can merge despite those failures. Anyone disagree? |
Improvements: dotnet/perf-autofiling-issues#828 and dotnet/perf-autofiling-issues#1109 |
This fixes #54166. This is the alternative version of #57372 that just improves
GT_MUL
codegen for arm64. Diffs are identical.Diff: https://www.diffchecker.com/9mkVf4se
Diff with baseline (no #52893): https://www.diffchecker.com/x2SRZJvW
Unfortunately this doesn't yet solve #47490, probably because the operand nodes are extended to 64-bit earlier.
cc @AndyAyersMS @echesakovMSFT