[Perf] Regressions in BenchmarksGame.FannkuchRedux_5 #68822

Closed
performanceautofiler bot opened this issue May 3, 2022 · 9 comments
Labels: arch-x64, area-CodeGen-coreclr, os-linux, runtime-coreclr, tenet-performance, tenet-performance-benchmarks


@performanceautofiler

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 8006e6a89bc02e410331e6323e3f6321b224b327
Compare e4163ea55ebb3673c29e1c2a850a6a790029d278
Diff Diff

Regressions in BenchmarksGame.FannkuchRedux_5

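| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RunBench - Duration of single invocation | 23.95 ms | 26.24 ms | 1.10 | 0.01 | True | | | | | |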

Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'BenchmarksGame.FannkuchRedux_5*'

Payloads

Baseline
Compare

Histogram

BenchmarksGame.FannkuchRedux_5.RunBench(n: 10, expectedSum: 38)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 26.23831273571429 > 25.0471650415.
IsChangePoint: Marked as a change because one of 4/26/2022 12:44:27 PM, 5/3/2022 11:42:39 AM falls between 4/24/2022 10:52:31 PM and 5/3/2022 11:42:39 AM.
IsRegressionStdDev: Marked as regression because -26.310636692824072 (T) = (0 -25737862.459520478) / Math.Sqrt((86587725068.70877 / (27)) + (81681730737.5921 / (36))) is less than -1.9996235849941724 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (36) - 2, .025) and -0.08183666563172187 = (23790894.94484017 - 25737862.459520478) / 23790894.94484017 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
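
The IsRegressionStdDev and IsRegressionBase checks above can be re-derived from the figures the bot printed. The following is a minimal sketch, not the autofiler's actual code. It assumes the first variance/count pair (n = 27) belongs to the baseline and the second (n = 36) to the compare build, and it reuses the reported critical value rather than recomputing it. Note that the reported T value is only reproduced when the numerator is the baseline mean minus the compare mean; the literal `0` in the printed formula does not reproduce it.

```python
import math

# Figures copied from the IsRegressionStdDev line (nanoseconds).
baseline_mean = 23790894.94484017
compare_mean = 25737862.459520478

# Assumption: the first variance/count pair is the baseline, the second the compare.
baseline_var, baseline_n = 86587725068.70877, 27
compare_var, compare_n = 81681730737.5921, 36

# Critical value as reported: StudentT.InvCDF(0, 1, 27 + 36 - 2, .025).
t_critical = -1.9996235849941724

# Unpooled two-sample t statistic, as in the printed formula.
standard_error = math.sqrt(baseline_var / baseline_n + compare_var / compare_n)
t_stat = (baseline_mean - compare_mean) / standard_error

# Relative change of the mean; negative means the compare build is slower.
relative_change = (baseline_mean - compare_mean) / baseline_mean

# Both conditions from the report must hold.
is_regression = t_stat < t_critical and relative_change < -0.05

print(f"T = {t_stat:.4f}")
print(f"relative change = {relative_change:.4%}")
print(f"regression = {is_regression}")
```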

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added CoreClr untriaged New issue has not been triaged by the area owner labels May 3, 2022
@DrewScoggins DrewScoggins transferred this issue from dotnet/perf-autofiling-issues May 3, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 3, 2022
@ghost

ghost commented May 3, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: DrewScoggins
Labels:

area-CodeGen-coreclr, untriaged, refs/heads/main, ubuntu 18.04, RunKind=micro, Regression, CoreClr, x64

Milestone: -

@DrewScoggins
Member

Seems related to #67930

@DrewScoggins DrewScoggins changed the title from "[Perf] Changes at 4/26/2022 4:04:55 PM" to "[Perf] Regressions in BenchmarksGame.FannkuchRedux_5" May 3, 2022
@DrewScoggins DrewScoggins added tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark labels May 3, 2022
@DrewScoggins
Member

Maybe related: dotnet/perf-autofiling-issues#5141

@DrewScoggins
Member

Possible improvements: dotnet/perf-autofiling-issues#5072

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label May 5, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone May 5, 2022
@kunalspathak
Member

After #67930, we clone more loops, and those cloned loops themselves contain more cloned loops. I will investigate further and see if we can adjust the heuristics for such cases.
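
For readers unfamiliar with loop cloning, here is a conceptual sketch of the transformation. It is a simplified model written in Python, not the JIT's implementation: the function names are hypothetical, and the explicit `if i >= len(a)` stands in for the per-iteration range check that the JIT would emit.

```python
def sum_prefix_checked(a, n):
    """Original loop: a range check runs on every iteration."""
    total = 0
    for i in range(n):
        if i >= len(a):  # models the per-iteration range check
            raise IndexError("index out of range")
        total += a[i]
    return total


def sum_prefix_cloned(a, n):
    """Cloned form: one up-front guard selects a fast or a slow copy."""
    if 0 <= n <= len(a):
        # Fast clone: the guard proves every index is in range,
        # so the per-iteration check is dropped.
        total = 0
        for i in range(n):
            total += a[i]
        return total
    # Slow clone: the original loop, with its checks intact.
    return sum_prefix_checked(a, n)
```

Each cloned loop duplicates the loop body, so cloning loops that already contain cloned loops multiplies code size, which is why the heuristics matter here.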

@kunalspathak
Member

68822.zip

@kunalspathak
Member

dumps: 68822_dumps.zip

@kunalspathak
Member

At first, looking at the before vs. after diff in the profiles, I saw the same region of code being hot, which led me to think that the change had disturbed code locality and caused the regression.

image

However, when I see other generated code before/after, I see some strange sequences:

  1. There are repeated test r11d, r11d comparisons that take us to IG18, meaning the IG18 ~ IG23 blocks are dead.

image

  2. As a side effect of that (I assume), we sprinkle the range checks in between the hot code and that might have introduced the slowness.

image

Full diffs are in 68822.zip.

I will continue investigating.

@kunalspathak
Member

There are repeated test r11d, r11d comparisons that take us to IG18, meaning the IG18 ~ IG23 blocks are dead.

This happens because, today, redundant branch optimization treats similar operators such as <= and < as distinct: it does not recognize that the outcome of one comparison can imply the outcome of the other, so the implied check is not eliminated. It is tricky to get this correct in all cases. Compare https://godbolt.org/z/4cqnvb8cY with SharpLab. Opened #72509 to track it.
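
To illustrate the kind of implication involved, here is a small sketch that works out when the known outcome of one comparison decides another comparison on the same operands. It is a brute-force illustration over a small integer domain, not how the JIT reasons about branches; `implied_outcome` and its parameters are hypothetical names.

```python
import operator
from itertools import product

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def implied_outcome(known_op, known_result, query_op, domain=range(-3, 4)):
    """Return True or False if `a query_op b` is decided once we know that
    `a known_op b` evaluated to `known_result`; return None if undecided."""
    outcomes = {
        OPS[query_op](a, b)
        for a, b in product(domain, repeat=2)
        if OPS[known_op](a, b) == known_result
    }
    # A single possible outcome means the second branch is redundant.
    return outcomes.pop() if len(outcomes) == 1 else None


print(implied_outcome("<", True, "<="))   # a < b holds, so a <= b must hold
print(implied_outcome("<=", False, "<"))  # a <= b fails, so a < b must fail
print(implied_outcome("<=", True, "<"))   # undecided: a == b is still possible
```

The third case shows why this is easy to get wrong: knowing that a <= b is true says nothing about a < b, whereas knowing it is false settles it.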

we sprinkle the range checks in between the hot code and that might have introduced the slowness.

This is happening because we are hitting the limit on the number of assertions we create after loop cloning. See details in #10591 and #10592.

As such, there is not much we can do at this point, and since we already have issues tracking both problems, I will close this issue.

@ghost ghost locked as resolved and limited conversation to collaborators Aug 19, 2022
@jeffhandley jeffhandley added the runtime-coreclr specific to the CoreCLR runtime label Dec 28, 2022
@jeffhandley jeffhandley added os-linux Linux OS (any supported distro) arch-x64 and removed CoreClr labels Dec 28, 2022