Regressions in System.Text.Json.Document.Tests.Perf_EnumerateArray.EnumerateUsingIndexer and Hashset.ContainsTrue<Int32> #67101

Closed
performanceautofiler bot opened this issue Mar 24, 2022 · 14 comments
Assignees
Labels
area-VM-coreclr, runtime-coreclr (specific to the CoreCLR runtime), tenet-performance (Performance related issue), tenet-performance-benchmarks (Issue from performance benchmark)
Milestone

Comments

@performanceautofiler

performanceautofiler bot commented Mar 24, 2022

### Run Information

Architecture arm64
OS Windows 10.0.19041
Baseline ac84ea6e241ad2d2cde346144b2c4b3a5d64fa1d
Compare 0b12d37843e7165fb4c8b794186f19ef43af6c73
Diff Diff
tighter diff a1bc0f3...5371203

Regressions in System.Text.Json.Document.Tests.Perf_EnumerateArray

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EnumerateUsingIndexer - Duration of single invocation | 1.74 μs | 1.85 μs | 1.06 | 0.01 | False | | | | | |
| EnumerateArray - Duration of single invocation (moved to https://github.com/dotnet/runtime/issues/67176) | 2.29 μs | 2.96 μs | 1.29 | 0.01 | False | | | | | |

graph

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Text.Json.Document.Tests.Perf_EnumerateArray*'

Payloads

Baseline
Compare

Histogram

System.Text.Json.Document.Tests.Perf_EnumerateArray.EnumerateUsingIndexer(TestCase: ArrayOfStrings)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.847930218805441 > 1.826793768692464.
IsChangePoint: Marked as a change because one of 12/2/2021 4:22:41 PM, 1/28/2022 4:34:00 PM, 3/17/2022 6:54:53 PM, 3/22/2022 12:26:23 PM falls between 3/13/2022 11:50:20 AM and 3/22/2022 12:26:23 PM.
IsRegressionStdDev: Marked as regression because -37.567265158486116 (T) = (0 -1844.6392731408614) / Math.Sqrt((179.10677767316412 / (27)) + (4.721594038302771 / (5))) is less than -2.0422724562973107 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (5) - 2, .025) and -0.05939220196144539 = (1741.2241375059637 - 1844.6392731408614) / 1741.2241375059637 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
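
The IsRegressionStdDev line above can be checked by hand. The following is a minimal sketch, not the auto-filer's code: the function name `t_statistic` is made up for illustration, the means and variances are copied from the log (they appear to be in nanoseconds), and the critical value is the one the log quotes rather than one computed here. Note that the log prints the numerator as `(0 - compare mean)`, but the logged T value appears to be reproduced only when the numerator is the baseline mean minus the compare mean, so the sketch uses that form.

```python
import math


def t_statistic(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """Difference of means divided by the combined standard error."""
    return (mean_base - mean_cmp) / math.sqrt(var_base / n_base + var_cmp / n_cmp)


# Values copied from the IsRegressionStdDev line for EnumerateUsingIndexer.
mean_base, var_base, n_base = 1741.2241375059637, 179.10677767316412, 27
mean_cmp, var_cmp, n_cmp = 1844.6392731408614, 4.721594038302771, 5

t = t_statistic(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp)
relative_change = (mean_base - mean_cmp) / mean_base

# Critical value quoted in the log: StudentT.InvCDF(0, 1, 27 + 5 - 2, .025).
T_CRITICAL = -2.0422724562973107

# Both conditions from the log line must hold for the regression flag.
is_regression = t < T_CRITICAL and relative_change < -0.05
print(f"t = {t:.3f}, relative change = {relative_change:.4f}, regression = {is_regression}")
```

With these inputs both conditions hold, which matches the "Marked as regression" verdict in the log.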

System.Text.Json.Document.Tests.Perf_EnumerateArray.EnumerateArray(TestCase: Json400KB)

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 2.9598147370059142 > 2.4018696254624405.
IsChangePoint: Marked as a change because one of 1/28/2022 4:34:00 PM, 3/15/2022 12:53:39 AM, 3/22/2022 12:26:23 PM falls between 3/13/2022 11:50:20 AM and 3/22/2022 12:26:23 PM.
IsRegressionStdDev: Marked as regression because -120.6428218804647 (T) = (0 -2935.2304895714733) / Math.Sqrt((19.852319641720772 / (16)) + (437.8173014793286 / (16))) is less than -2.0422724562973107 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (16) + (16) - 2, .025) and -0.2817622489671951 = (2289.996051870456 - 2935.2304895714733) / 2289.996051870456 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

### Run Information
Architecture arm64
OS Windows 10.0.19041
Baseline ac84ea6e241ad2d2cde346144b2c4b3a5d64fa1d
Compare 0b12d37843e7165fb4c8b794186f19ef43af6c73
Diff Diff

Regressions in System.Collections.ContainsTrue<Int32>

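| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HashSet - Duration of single invocation | 4.99 μs | 6.01 μs | 1.20 | 0.05 | False | | | | | |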

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.ContainsTrue<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsTrue<Int32>.HashSet(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 6.013932003309152 > 5.256438464261174.
IsChangePoint: Marked as a change because one of 12/7/2021 3:27:58 PM, 12/8/2021 11:18:41 PM, 12/27/2021 6:47:12 AM, 3/17/2022 6:54:53 PM, 3/22/2022 12:26:23 PM falls between 3/13/2022 11:50:20 AM and 3/22/2022 12:26:23 PM.
IsRegressionStdDev: Marked as regression because -19.80322266903542 (T) = (0 -6053.083872496654) / Math.Sqrt((1148.5014152204074 / (27)) + (13486.517429206551 / (5))) is less than -2.0422724562973107 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (5) - 2, .025) and -0.2066314747169855 = (5016.51415475997 - 6053.083872496654) / 5016.51415475997 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added arm64 PGO untriaged New issue has not been triaged by the area owner labels Mar 24, 2022
@DrewScoggins DrewScoggins removed arm64 untriaged New issue has not been triaged by the area owner labels Mar 24, 2022
@DrewScoggins DrewScoggins changed the title from "[Perf] Changes at 3/18/2022 1:59:24 AM" to "Regressions in System.Text.Json.Document.Tests.Perf_EnumerateArray" Mar 24, 2022
@DrewScoggins DrewScoggins transferred this issue from dotnet/perf-autofiling-issues Mar 24, 2022
@DrewScoggins DrewScoggins added tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark and removed PGO labels Mar 24, 2022
@DrewScoggins
Member

DrewScoggins commented Mar 24, 2022

Likely #65738

@DrewScoggins
Member

DrewScoggins commented Mar 24, 2022

Related dotnet/perf-autofiling-issues#4255

@ghost

ghost commented Mar 25, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: -
Labels:

area-System.Text.Json, tenet-performance, tenet-performance-benchmarks, refs/heads/main, RunKind=micro, Windows 10.0.19041, Regression, CoreClr

Milestone: -

@danmoseley
Member

I see no plausible libraries changes in this interval. Moving to VM per suggestion above, but it could be something else.

@janvorli
Member

The System.Text.Json.Document.Tests.Perf_EnumerateArray regression isn't caused by #65738; it occurred two days before that change went in.

@danmoseley
Member

Ah yes. The HashSet and EnumerateUsingIndexer commit range is a1bc0f3...5371203 (it includes #66456, which might still be totally innocent).

The EnumerateArray commit range is 6bf873a...bc5e386. The plausible candidate there is #66618 (cc @AndyAyersMS?).

@DrewScoggins it seems the 3 regressions in this issue occurred across 2 different intervals, so there ought to be 2 issues. Analysis script glitch, maybe?

@danmoseley
Member

I also notice that the commit range above (e.g., the first one) is 17 commits, but from the graph, I think it should span only 8 commits.

@DrewScoggins
Member

Yeah, in general the auto-filer can do a good job of putting changes together that happened at the same time. Sometimes however, the actual point that the changepoint algorithm picks is several points off from where the actual changepoint is. This is an underlying weakness inherent in the algorithm. Normally, when we do the triage we try and make sure that only issues that match together end up in the same place, but this one slipped through.

The second piece is that, because of this underlying weakness in the algorithm, I make the diff link start a commit earlier and end a commit later than the range the tool discovers, to increase the odds that the offending commit is in that range.
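
As a rough illustration of that padding, here is a minimal sketch. It is not the auto-filer's code: the function name `widen_range`, the `pad` parameter, and the commit ids are all hypothetical.

```python
def widen_range(commits, first, last, pad=1):
    """Return commits[first..last] (inclusive), extended by `pad` commits
    on each side and clamped to the bounds of the list."""
    lo = max(0, first - pad)
    hi = min(len(commits) - 1, last + pad)
    return commits[lo:hi + 1]


# Hypothetical commit ids in chronological order.
history = ["c0", "c1", "c2", "c3", "c4", "c5"]

# Suppose the changepoint detector points at the interval c2..c3.
padded = widen_range(history, 2, 3)
print(padded)
```

Clamping keeps the range valid when the detected interval already touches the start or end of the history.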

@danmoseley
Member

That's interesting. I'm curious what algorithm you use, etc-- is there a repo somewhere? This one on the face of it seems like an "easy case". (I know it's a hard problem and don't pretend to be able to do better)
image

OK I'll break the 2nd one out into its own issue.

@danmoseley danmoseley changed the title from "Regressions in System.Text.Json.Document.Tests.Perf_EnumerateArray" to "Regressions in System.Text.Json.Document.Tests.Perf_EnumerateArray.EnumerateUsingIndexer and Hashset.ContainsTrue<Int32>" Mar 25, 2022
@DrewScoggins
Member

It is checked into an internal repo, but I will put the relevant code here. You will also want to take a look at this package, ruptures, as it is the one we use for the changepoint analysis. You will also find linked there a good paper on changepoint analysis as a field of study. I used that one pretty extensively when I was first building this.

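import sys

import numpy as np
import ruptures as rpt

# Read comma-separated measurements from the file named on the command line.
with open(sys.argv[1], "r") as f:
    data = f.read()

# Convert the values to a NumPy array of floats.
points = np.array([float(p) for p in data.split(',')])

# PELT search with the Mahalanobis cost model; every index is a candidate (jump=1)
# and each segment must contain at least 3 points.
algo = rpt.Pelt(model="mahalanobis", jump=1, min_size=3).fit(points)
# The penalty scales with the log of the series length.
results = algo.predict(pen=4 * np.log(len(points)))
print(results)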
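
To read the printed result: `predict` returns a sorted list of segment end indices, the last of which is the length of the series. The sketch below turns such a list into per-segment means so the shift is visible. It is illustrative only and uses the standard library. The helper name `summarize_segments`, the timings, and the breakpoints are made up for the example and are not output from the script above.

```python
import math
from statistics import mean


def summarize_segments(points, breakpoints):
    """Return (start, end, mean) for each segment, where `breakpoints`
    holds the exclusive end index of every segment."""
    segments = []
    start = 0
    for end in breakpoints:
        segments.append((start, end, mean(points[start:end])))
        start = end
    return segments


# Hypothetical timings in microseconds with a step up after the fifth build.
timings = [1.74, 1.75, 1.73, 1.74, 1.75, 1.85, 1.84, 1.86, 1.85]
# Breakpoints as a detector might report them (assumed here, not computed).
breakpoints = [5, len(timings)]

segments = summarize_segments(timings, breakpoints)
for start, end, avg in segments:
    print(f"builds {start}..{end - 1}: mean {avg:.3f} us")

# The penalty expression from the script above, evaluated for this series length.
penalty = 4 * math.log(len(timings))
```

Because the penalty grows with the logarithm of the series length, longer histories need a larger cost reduction before an extra changepoint is accepted.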

@EgorBo
Member

EgorBo commented Mar 29, 2022

Perf_Version.TryFormat2 regression on windows-x64

@mangod9 mangod9 added this to the 7.0.0 milestone Jul 6, 2022
@mangod9
Member

mangod9 commented Jul 27, 2022

Is this still an issue? From @janvorli's comment, this is not related to VM changes.

@mangod9
Member

mangod9 commented Aug 11, 2022

@DrewScoggins is this issue actionable? Based on the charts, this looks like it's within the range of what things were in 6?

@mangod9
Member

mangod9 commented Aug 18, 2022

I am closing this as not-actionable at the moment.

@mangod9 mangod9 closed this as completed Aug 18, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Sep 18, 2022
@jeffhandley jeffhandley added runtime-coreclr specific to the CoreCLR runtime and removed CoreClr labels Dec 28, 2022