[Perf] Windows/x64: 13 Improvements on 2/2/2023 8:18:56 AM #12471

Closed
performanceautofiler bot opened this issue Feb 7, 2023 · 2 comments
Labels: arch-x64, branch-refs/heads/main, kind-micro, os-windows, perf-improvement, PGO (applied if there were any profile guided optimization updates in the observed interval), runtime-coreclr

performanceautofiler bot commented Feb 7, 2023

### Run Information

| Name | Value |
| --- | --- |
| Architecture | x64 |
| OS | Windows 10.0.18362 |
| Baseline | 2ba2396495c22429035d165e478672c442f81e22 |
| Compare | 6aa9f8b5a5d7ea4d79715f0b16f2a5b0ab6ac48d |

Improvements in System.Numerics.Tests.Perf_Matrix4x4

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Transpose - Duration of single invocation | 8.31 ns | 0.93 ns | 0.11 | 0.02 | False | | | | Trace | Trace |
| CreateShadowBenchmark - Duration of single invocation | 12.48 ns | 8.24 ns | 0.66 | 0.02 | False | 72.44244789047403 | 56.6111495900296 | 0.7814637859231397 | Trace | Trace |
| CreateReflectionBenchmark - Duration of single invocation | 9.86 ns | 6.36 ns | 0.64 | 0.01 | False | | | | Trace | Trace |


Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Matrix4x4*'


System.Numerics.Tests.Perf_Matrix4x4.Transpose


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.9347888896656968 < 8.038721390715988.
IsChangePoint: Marked as a change because one of 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 256.86794442176654 (T) = (0 -0.9551903415440404) / Math.Sqrt((0.2675584877279325 / (299)) + (0.0014991678268240927 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8925836926808375 = (8.892414619187331 - 0.9551903415440404) / 8.892414619187331 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
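
The `IsImprovementStdDev` line above amounts to a two-sample t-statistic plus a relative-change check. The following is a minimal Python sketch of that arithmetic using the values logged for `Transpose`, not the bot's actual code. The function and parameter names are hypothetical, and treating the two divided quantities as variances is an assumption. The numerator uses the baseline mean (8.89...) where the log prints `0`, because that is the only reading under which the logged `T` of roughly 256.87 is reproduced. The critical value is copied from the log rather than recomputed.

```python
import math

def is_improvement_std_dev(base_mean, base_var, base_n,
                           cmp_mean, cmp_var, cmp_n,
                           t_critical, min_rel_change=0.05):
    """Hypothetical re-creation of the logged check (names are assumptions)."""
    # Two-sample t-statistic with unpooled variances.
    t = (base_mean - cmp_mean) / math.sqrt(base_var / base_n + cmp_var / cmp_n)
    # Fraction of the baseline mean that the compare build removed.
    rel_change = (base_mean - cmp_mean) / base_mean
    return t, rel_change, (t > t_critical and rel_change > min_rel_change)

t, rel, improved = is_improvement_std_dev(
    base_mean=8.892414619187331, base_var=0.2675584877279325, base_n=299,
    cmp_mean=0.9551903415440404, cmp_var=0.0014991678268240927, cmp_n=25,
    # StudentT.InvCDF(0, 1, 299 + 25 - 2, .975), value taken from the log.
    t_critical=1.9673585853226652,
)
print(round(t, 2), round(rel, 4), improved)
```

Both conditions have to hold: the statistic must exceed the critical value and the mean must drop by more than 5% of the baseline.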

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.Transpose()
       push      rsi
       sub       rsp,0A0
       vzeroupper
       mov       rsi,rdx
       vmovups   xmm0,[7FFE7D925060]
       vmovups   [rsp+60],xmm0
       vmovups   xmm0,[7FFE7D925070]
       vmovups   [rsp+70],xmm0
       vmovups   xmm0,[7FFE7D925080]
       vmovups   [rsp+80],xmm0
       vmovups   xmm0,[7FFE7D925090]
       vmovups   [rsp+90],xmm0
       lea       rcx,[rsp+20]
       lea       rdx,[rsp+60]
       call      qword ptr [7FFE7DE5BB10]; System.Numerics.Matrix4x4+Impl.Transpose(Impl ByRef)
       vmovdqu   ymm0,ymmword ptr [rsp+20]
       vmovdqu   ymmword ptr [rsi],ymm0
       vmovdqu   ymm0,ymmword ptr [rsp+40]
       vmovdqu   ymmword ptr [rsi+20],ymm0
       mov       rax,rsi
       add       rsp,0A0
       pop       rsi
       ret
; Total bytes of code 125
; System.Numerics.Matrix4x4+Impl.Transpose(Impl ByRef)
       vzeroupper
       vmovups   xmm0,[rdx]
       vmovups   xmm1,[rdx+10]
       vmovups   xmm2,[rdx+20]
       vmovups   xmm3,[rdx+30]
       vunpcklps xmm4,xmm0,xmm2
       vunpcklps xmm5,xmm1,xmm3
       vunpckhps xmm0,xmm0,xmm2
       vunpckhps xmm1,xmm1,xmm3
       vunpcklps xmm2,xmm4,xmm5
       vunpckhps xmm3,xmm4,xmm5
       vunpcklps xmm4,xmm0,xmm1
       vunpckhps xmm0,xmm0,xmm1
       vmovups   [rcx],xmm2
       vmovups   [rcx+10],xmm3
       vmovups   [rcx+20],xmm4
       vmovups   [rcx+30],xmm0
       mov       rax,rcx
       ret
; Total bytes of code 77
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.Transpose()
       vzeroupper
       vmovups   xmm0,[7FFBBF155050]
       vunpcklps xmm0,xmm0,[7FFBBF155060]
       vmovups   xmm1,[7FFBBF155070]
       vunpcklps xmm1,xmm1,[7FFBBF155080]
       vmovups   xmm2,[7FFBBF155050]
       vunpckhps xmm2,xmm2,[7FFBBF155060]
       vmovups   xmm3,[7FFBBF155070]
       vunpckhps xmm3,xmm3,[7FFBBF155080]
       vunpcklps xmm4,xmm0,xmm1
       vunpckhps xmm0,xmm0,xmm1
       vunpcklps xmm1,xmm2,xmm3
       vunpckhps xmm2,xmm2,xmm3
       vmovups   [rdx],xmm4
       vmovups   [rdx+10],xmm0
       vmovups   [rdx+20],xmm1
       vmovups   [rdx+30],xmm2
       mov       rax,rdx
       ret
; Total bytes of code 106
```
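
The eight `vunpcklps`/`vunpckhps` instructions in the listings above perform the 4x4 transpose in two interleaving passes. The Python sketch below models each instruction on 4-element lists and follows the register operand order of the baseline `Matrix4x4+Impl.Transpose` listing. The compare listing inlines the same unpacks with memory operands, but which row lives at which constant address is not visible there, so the sketch does not model it. Function names are illustrative only.

```python
def unpcklps(a, b):
    """Interleave the low two lanes of a and b: [a0, b0, a1, b1]."""
    return [a[0], b[0], a[1], b[1]]

def unpckhps(a, b):
    """Interleave the high two lanes of a and b: [a2, b2, a3, b3]."""
    return [a[2], b[2], a[3], b[3]]

def transpose4x4(rows):
    r0, r1, r2, r3 = rows
    # First pass: pair row 0 with row 2 and row 1 with row 3.
    lo02, lo13 = unpcklps(r0, r2), unpcklps(r1, r3)
    hi02, hi13 = unpckhps(r0, r2), unpckhps(r1, r3)
    # Second pass: each result is one column of the input.
    return [unpcklps(lo02, lo13), unpckhps(lo02, lo13),
            unpcklps(hi02, hi13), unpckhps(hi02, hi13)]

m = [[11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34],
     [41, 42, 43, 44]]
for row in transpose4x4(m):
    print(row)
```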

System.Numerics.Tests.Perf_Matrix4x4.CreateShadowBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 8.237259923259558 < 11.875804438553851.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 55.94318276978657 (T) = (0 -8.257289947525265) / Math.Sqrt((3.706729669042814 / (299)) + (0.0006987871554295211 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.4302624182560546 = (14.493145988807708 - 8.257289947525265) / 14.493145988807708 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.CreateShadowBenchmark()
       sub       rsp,18
       vzeroupper
       vmovaps   [rsp],xmm6
       vxorps    xmm0,xmm0,xmm0
       vxorps    xmm1,xmm1,xmm1
       vdpps     xmm0,xmm0,xmm1,71
       vsubss    xmm1,xmm0,dword ptr [7FFEB55D5160]
       vandps    xmm1,xmm1,[7FFEB55D5170]
       vmovss    xmm2,dword ptr [7FFEB55D5180]
       vucomiss  xmm2,xmm1
       jbe       short M00_L00
       vxorps    xmm1,xmm1,xmm1
       vxorps    xmm2,xmm2,xmm2
       jmp       short M00_L01
M00_L00:
       vsqrtss   xmm0,xmm0,xmm0
       vmovaps   xmm1,xmm0
       vbroadcastss xmm2,xmm1
       vxorps    xmm1,xmm1,xmm1
       vdivps    xmm1,xmm1,xmm2
       vxorps    xmm2,xmm2,xmm2
       vdivss    xmm2,xmm2,xmm0
M00_L01:
       vmovups   xmm0,[7FFEB55D5190]
       vdpps     xmm0,xmm0,xmm1,71
       vxorps    xmm3,xmm3,xmm3
       vsubps    xmm1,xmm3,xmm1
       vmovaps   xmm3,xmm1
       vbroadcastss xmm3,xmm3
       vmovups   xmm4,[7FFEB55D5190]
       vmulps    xmm3,xmm3,xmm4
       vmovaps   xmm4,xmm3
       vmovshdup xmm5,xmm3
       vinsertps xmm4,xmm4,xmm5,10
       vunpckhps xmm3,xmm3,xmm3
       vinsertps xmm3,xmm4,xmm3,28
       vmovaps   xmm4,xmm0
       vinsertps xmm4,xmm4,xmm4,3E
       vaddps    xmm3,xmm3,xmm4
       vmovshdup xmm4,xmm1
       vbroadcastss xmm4,xmm4
       vmovups   xmm5,[7FFEB55D5190]
       vmulps    xmm4,xmm4,xmm5
       vmovaps   xmm5,xmm4
       vmovshdup xmm6,xmm4
       vinsertps xmm5,xmm5,xmm6,10
       vunpckhps xmm4,xmm4,xmm4
       vinsertps xmm4,xmm5,xmm4,28
       vinsertps xmm5,xmm5,xmm0,1D
       vaddps    xmm4,xmm4,xmm5
       vunpckhps xmm1,xmm1,xmm1
       vbroadcastss xmm1,xmm1
       vmovups   xmm5,[7FFEB55D5190]
       vmulps    xmm1,xmm1,xmm5
       vmovaps   xmm5,xmm1
       vmovshdup xmm6,xmm1
       vinsertps xmm5,xmm5,xmm6,10
       vunpckhps xmm1,xmm1,xmm1
       vinsertps xmm1,xmm5,xmm1,28
       vinsertps xmm5,xmm5,xmm0,2B
       vaddps    xmm1,xmm1,xmm5
       vxorps    xmm2,xmm2,[7FFEB55D51A0]
       vbroadcastss xmm2,xmm2
       vmovups   xmm5,[7FFEB55D5190]
       vmulps    xmm2,xmm2,xmm5
       vmovaps   xmm5,xmm2
       vmovshdup xmm6,xmm2
       vinsertps xmm5,xmm5,xmm6,10
       vunpckhps xmm2,xmm2,xmm2
       vinsertps xmm2,xmm5,xmm2,20
       vinsertps xmm0,xmm2,xmm0,30
       vmovups   [rdx],xmm3
       vmovups   [rdx+10],xmm4
       vmovups   [rdx+20],xmm1
       vmovups   [rdx+30],xmm0
       mov       rax,rdx
       vmovaps   xmm6,[rsp]
       add       rsp,18
       ret
; Total bytes of code 373
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.CreateShadowBenchmark()
       sub       rsp,18
       vzeroupper
       vmovaps   [rsp],xmm6
       vxorps    xmm0,xmm0,xmm0
       vdpps     xmm0,xmm0,xmm0,7F
       vsubss    xmm1,xmm0,dword ptr [7FF9AB2D5130]
       vandps    xmm1,xmm1,[7FF9AB2D5140]
       vmovss    xmm2,dword ptr [7FF9AB2D5150]
       vucomiss  xmm2,xmm1
       jbe       short M00_L00
       vxorps    xmm1,xmm1,xmm1
       jmp       short M00_L01
M00_L00:
       vsqrtss   xmm0,xmm0,xmm0
       vxorps    xmm1,xmm1,xmm1
       vmovaps   xmm2,xmm0
       vbroadcastss xmm2,xmm2
       vdivps    xmm1,xmm1,xmm2
       vxorps    xmm2,xmm2,xmm2
       vshufps   xmm2,xmm2,xmm2,0FF
       vdivss    xmm0,xmm2,xmm0
       vinsertps xmm1,xmm1,xmm0,30
M00_L01:
       vmovaps   xmm0,xmm1
       vmovups   xmm2,[7FF9AB2D5160]
       vdpps     xmm2,xmm2,xmm0,7F
       vxorps    xmm3,xmm3,xmm3
       vsubps    xmm0,xmm3,xmm0
       vmovaps   xmm3,xmm2
       vinsertps xmm3,xmm3,xmm3,3E
       vmovaps   xmm4,xmm0
       vbroadcastss xmm4,xmm4
       vmovups   xmm5,[7FF9AB2D5160]
       vmulps    xmm4,xmm4,xmm5
       vinsertps xmm4,xmm4,xmm4,38
       vaddps    xmm3,xmm4,xmm3
       vinsertps xmm4,xmm4,xmm2,1D
       vmovshdup xmm5,xmm0
       vbroadcastss xmm5,xmm5
       vmovups   xmm6,[7FF9AB2D5160]
       vmulps    xmm5,xmm5,xmm6
       vinsertps xmm5,xmm5,xmm5,38
       vaddps    xmm4,xmm5,xmm4
       vinsertps xmm5,xmm5,xmm2,2B
       vunpckhps xmm0,xmm0,xmm0
       vbroadcastss xmm0,xmm0
       vmulps    xmm0,xmm0,xmm6
       vinsertps xmm0,xmm0,xmm0,38
       vaddps    xmm0,xmm0,xmm5
       vshufps   xmm1,xmm1,xmm1,0FF
       vxorps    xmm1,xmm1,[7FF9AB2D5170]
       vbroadcastss xmm1,xmm1
       vmulps    xmm1,xmm1,xmm6
       vinsertps xmm1,xmm1,xmm2,30
       vmovups   [rdx],xmm3
       vmovups   [rdx+10],xmm4
       vmovups   [rdx+20],xmm0
       vmovups   [rdx+30],xmm1
       mov       rax,rdx
       vmovaps   xmm6,[rsp]
       add       rsp,18
       ret
; Total bytes of code 291
```
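
Much of the size difference between the two `CreateShadowBenchmark` listings above comes from how the fourth lane is cleared. The baseline repeats a `vmovshdup` / `vinsertps ...,10` / `vunpckhps` / `vinsertps ...,28` chain, where the compare build uses a single `vinsertps x,x,x,38`. The Python sketch below models those instructions on 4-element lists to show that both forms yield the same vector. The helper names are illustrative, and only the register-source form of `insertps` is modelled.

```python
def insertps(dst, src, imm8):
    """Model of (v)insertps with a register source operand."""
    src_lane = (imm8 >> 6) & 0x3   # imm8[7:6]: lane read from src
    dst_lane = (imm8 >> 4) & 0x3   # imm8[5:4]: lane written in the result
    zero_mask = imm8 & 0xF         # imm8[3:0]: lanes forced to 0.0 afterwards
    out = list(dst)
    out[dst_lane] = src[src_lane]
    return [0.0 if (zero_mask >> i) & 1 else out[i] for i in range(4)]

def movshdup(a):
    """Duplicate the odd lanes: [a1, a1, a3, a3]."""
    return [a[1], a[1], a[3], a[3]]

def unpckhps(a, b):
    """Interleave the high two lanes: [a2, b2, a3, b3]."""
    return [a[2], b[2], a[3], b[3]]

def baseline_chain(p):
    # vmovshdup + vinsertps ...,10: lane 1 <- p[1]
    t = insertps(p, movshdup(p), 0x10)
    # vunpckhps + vinsertps ...,28: lane 2 <- p[2], lane 3 zeroed
    return insertps(t, unpckhps(p, p), 0x28)

def compare_single(p):
    # vinsertps x,x,x,38: lane 3 <- p[0], then lane 3 zeroed
    return insertps(p, p, 0x38)

p = [1.0, 2.0, 3.0, 4.0]
print(baseline_chain(p), compare_single(p))
```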

System.Numerics.Tests.Perf_Matrix4x4.CreateReflectionBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 6.358666014217257 < 9.341016708721801.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 41.395542193834125 (T) = (0 -6.222906215021571) / Math.Sqrt((6.833476814382739 / (299)) + (0.005188679560673321 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.5025378064590694 = (12.509304819180311 - 6.222906215021571) / 12.509304819180311 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.CreateReflectionBenchmark()
       sub       rsp,28
       vzeroupper
       vmovaps   [rsp+10],xmm6
       vmovaps   [rsp],xmm7
       vxorps    xmm0,xmm0,xmm0
       vxorps    xmm1,xmm1,xmm1
       vdpps     xmm0,xmm0,xmm1,71
       vmovss    xmm1,dword ptr [7FFD8D415110]
       vsubss    xmm2,xmm0,xmm1
       vandps    xmm2,xmm2,[7FFD8D415120]
       vmovss    xmm3,dword ptr [7FFD8D415130]
       vucomiss  xmm3,xmm2
       jbe       short M00_L00
       vxorps    xmm2,xmm2,xmm2
       vxorps    xmm3,xmm3,xmm3
       jmp       short M00_L01
M00_L00:
       vsqrtss   xmm0,xmm0,xmm0
       vmovaps   xmm2,xmm0
       vbroadcastss xmm3,xmm2
       vxorps    xmm2,xmm2,xmm2
       vdivps    xmm2,xmm2,xmm3
       vxorps    xmm3,xmm3,xmm3
       vdivss    xmm3,xmm3,xmm0
M00_L01:
       vmovups   xmm0,[7FFD8D415140]
       vmulps    xmm0,xmm2,xmm0
       vmovaps   xmm4,xmm2
       vbroadcastss xmm4,xmm4
       vmulps    xmm4,xmm0,xmm4
       vmovaps   xmm5,xmm4
       vmovshdup xmm6,xmm4
       vinsertps xmm5,xmm5,xmm6,10
       vunpckhps xmm4,xmm4,xmm4
       vinsertps xmm4,xmm5,xmm4,28
       vaddps    xmm4,xmm4,[7FFD8D415150]
       vmovshdup xmm5,xmm2
       vbroadcastss xmm5,xmm5
       vmulps    xmm5,xmm0,xmm5
       vmovaps   xmm6,xmm5
       vmovshdup xmm7,xmm5
       vinsertps xmm6,xmm6,xmm7,10
       vunpckhps xmm5,xmm5,xmm5
       vinsertps xmm5,xmm6,xmm5,28
       vaddps    xmm5,xmm5,[7FFD8D415160]
       vunpckhps xmm2,xmm2,xmm2
       vbroadcastss xmm2,xmm2
       vmulps    xmm2,xmm0,xmm2
       vmovaps   xmm6,xmm2
       vmovshdup xmm7,xmm2
       vinsertps xmm6,xmm6,xmm7,10
       vunpckhps xmm2,xmm2,xmm2
       vinsertps xmm2,xmm6,xmm2,28
       vaddps    xmm2,xmm2,[7FFD8D415170]
       vbroadcastss xmm3,xmm3
       vmulps    xmm0,xmm0,xmm3
       vmovaps   xmm3,xmm0
       vmovshdup xmm6,xmm0
       vinsertps xmm3,xmm3,xmm6,10
       vunpckhps xmm0,xmm0,xmm0
       vinsertps xmm0,xmm3,xmm0,20
       vinsertps xmm0,xmm0,xmm1,30
       vmovups   [rdx],xmm4
       vmovups   [rdx+10],xmm5
       vmovups   [rdx+20],xmm2
       vmovups   [rdx+30],xmm0
       mov       rax,rdx
       vmovaps   xmm6,[rsp+10]
       vmovaps   xmm7,[rsp]
       add       rsp,28
       ret
; Total bytes of code 329
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Matrix4x4.CreateReflectionBenchmark()
       vzeroupper
       vxorps    xmm0,xmm0,xmm0
       vdpps     xmm0,xmm0,xmm0,7F
       vmovss    xmm1,dword ptr [7FFBBF1750B0]
       vsubss    xmm2,xmm0,xmm1
       vandps    xmm2,xmm2,[7FFBBF1750C0]
       vmovss    xmm3,dword ptr [7FFBBF1750D0]
       vucomiss  xmm3,xmm2
       jbe       short M00_L00
       vxorps    xmm2,xmm2,xmm2
       jmp       short M00_L01
M00_L00:
       vsqrtss   xmm0,xmm0,xmm0
       vxorps    xmm2,xmm2,xmm2
       vmovaps   xmm3,xmm0
       vbroadcastss xmm3,xmm3
       vdivps    xmm2,xmm2,xmm3
       vxorps    xmm3,xmm3,xmm3
       vshufps   xmm3,xmm3,xmm3,0FF
       vdivss    xmm0,xmm3,xmm0
       vinsertps xmm2,xmm2,xmm0,30
M00_L01:
       vmovaps   xmm0,xmm2
       vmovups   xmm3,[7FFBBF1750E0]
       vmulps    xmm0,xmm0,xmm3
       vmovaps   xmm3,xmm2
       vbroadcastss xmm3,xmm3
       vmulps    xmm3,xmm0,xmm3
       vinsertps xmm3,xmm3,xmm3,38
       vaddps    xmm3,xmm3,[7FFBBF1750F0]
       vmovshdup xmm4,xmm2
       vbroadcastss xmm4,xmm4
       vmulps    xmm4,xmm0,xmm4
       vinsertps xmm4,xmm4,xmm4,38
       vaddps    xmm4,xmm4,[7FFBBF175100]
       vunpckhps xmm5,xmm2,xmm2
       vbroadcastss xmm5,xmm5
       vmulps    xmm5,xmm0,xmm5
       vinsertps xmm5,xmm5,xmm5,38
       vaddps    xmm5,xmm5,[7FFBBF175110]
       vshufps   xmm2,xmm2,xmm2,0FF
       vbroadcastss xmm2,xmm2
       vmulps    xmm0,xmm0,xmm2
       vinsertps xmm0,xmm0,xmm1,30
       vmovups   [rdx],xmm3
       vmovups   [rdx+10],xmm4
       vmovups   [rdx+20],xmm5
       vmovups   [rdx+30],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 233
```
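
The "Total bytes of code" lines for the three `Perf_Matrix4x4` benchmarks can be summarized as size ratios. The short Python sketch below does that arithmetic. Counting the non-inlined `Impl.Transpose` callee as part of the baseline `Transpose` size is a choice made here, not something the report states, and `size_ratio` is a hypothetical helper.

```python
# (baseline bytes, compare bytes), copied from the "Total bytes of code" lines.
code_bytes = {
    # Baseline = benchmark method (125) + Impl.Transpose callee (77).
    "Transpose": (125 + 77, 106),
    "CreateShadowBenchmark": (373, 291),
    "CreateReflectionBenchmark": (329, 233),
}

def size_ratio(baseline, compare):
    """Compare size as a fraction of baseline size."""
    return compare / baseline

for name, (baseline, compare) in code_bytes.items():
    print(f"{name}: {baseline} -> {compare} bytes "
          f"(ratio {size_ratio(baseline, compare):.2f})")
```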

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


Improvements in System.Numerics.Tests.Perf_Quaternion

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NormalizeBenchmark - Duration of single invocation | 3.51 ns | 0.25 ns | 0.07 | 0.02 | False | | | | Trace | Trace |
| InverseBenchmark - Duration of single invocation | 3.14 ns | 0.35 ns | 0.11 | 0.05 | False | | | | Trace | Trace |
| EqualsBenchmark - Duration of single invocation | 11.43 ns | 0.16 ns | 0.01 | 0.00 | False | 27.508240797654345 | 10.858697839445696 | 0.3947434486749014 | Trace | Trace |
| CreateFromVector3WithScalarBenchmark - Duration of single invocation | 6.41 ns | 0.08 ns | 0.01 | 0.34 | False | 11.652465422565587 | 3.875894123736579 | 0.332624383182525 | Trace | Trace |


Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Quaternion*'


System.Numerics.Tests.Perf_Quaternion.NormalizeBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.25201958323700524 < 3.3371701833057115.
IsChangePoint: Marked as a change because one of 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 357.99644834398947 (T) = (0 -0.2467978248272763) / Math.Sqrt((0.0005024953943396132 / (299)) + (0.0020390596631429596 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.9297486951049936 = (3.513071041116261 - 0.2467978248272763) / 3.513071041116261 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.NormalizeBenchmark()
       push      rsi
       sub       rsp,30
       mov       rsi,rdx
       xor       ecx,ecx
       mov       [rsp+20],ecx
       mov       [rsp+24],ecx
       mov       [rsp+28],ecx
       mov       dword ptr [rsp+2C],3F800000
       mov       rcx,rsi
       lea       rdx,[rsp+20]
       call      qword ptr [7FFB80781D50]; System.Numerics.Quaternion.Normalize(System.Numerics.Quaternion)
       mov       rax,rsi
       add       rsp,30
       pop       rsi
       ret
; Total bytes of code 53
; System.Numerics.Quaternion.Normalize(System.Numerics.Quaternion)
       vzeroupper
       vmovss    xmm0,dword ptr [rdx]
       vmovss    xmm1,dword ptr [rdx+4]
       vmovss    xmm2,dword ptr [rdx+8]
       vmovss    xmm3,dword ptr [rdx+0C]
       vmulss    xmm4,xmm0,xmm0
       vmulss    xmm5,xmm1,xmm1
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm2,xmm2
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm3,xmm3
       vaddss    xmm4,xmm4,xmm5
       vsqrtss   xmm4,xmm4,xmm4
       vmovss    xmm5,dword ptr [7FFB803C5070]
       vdivss    xmm4,xmm5,xmm4
       vmulss    xmm0,xmm0,xmm4
       vmulss    xmm1,xmm1,xmm4
       vmulss    xmm2,xmm2,xmm4
       vmulss    xmm3,xmm3,xmm4
       vmovss    dword ptr [rcx],xmm0
       vmovss    dword ptr [rcx+4],xmm1
       vmovss    dword ptr [rcx+8],xmm2
       vmovss    dword ptr [rcx+0C],xmm3
       mov       rax,rcx
       ret
; Total bytes of code 105
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.NormalizeBenchmark()
       vzeroupper
       vmovups   xmm0,[7FFBB5CE4FD0]
       vdpps     xmm1,xmm0,xmm0,0FF
       vsqrtps   xmm1,xmm1
       vdivps    xmm0,xmm0,xmm1
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 33
```
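
The compare listing above reduces `Normalize` to three vector instructions: a dot product broadcast to all lanes (`vdpps ...,0FF`), a packed square root, and a packed divide. The Python sketch below models that sequence on 4-element lists. The function names are illustrative, and the `dpps` model ignores the hardware's internal summation order.

```python
import math

def dpps(a, b, imm8):
    """Model of (v)dpps: imm8[7:4] selects the lanes that are multiplied,
    imm8[3:0] selects the result lanes that receive the sum (others are 0.0)."""
    total = sum(a[i] * b[i] for i in range(4) if (imm8 >> (4 + i)) & 1)
    return [total if (imm8 >> i) & 1 else 0.0 for i in range(4)]

def normalize(q):
    length_sq = dpps(q, q, 0xFF)                 # vdpps   xmm1,xmm0,xmm0,0FF
    length = [math.sqrt(v) for v in length_sq]   # vsqrtps xmm1,xmm1
    return [x / l for x, l in zip(q, length)]    # vdivps  xmm0,xmm0,xmm1

print(normalize([0.0, 0.0, 0.0, 1.0]))
print(normalize([0.0, 3.0, 0.0, 4.0]))
```

The same model covers the `71` mask used in the baseline `Perf_Matrix4x4` listings, which multiplies only the first three lanes and writes the sum to lane 0 alone.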

System.Numerics.Tests.Perf_Quaternion.InverseBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.35074227755217724 < 2.9814670775358056.
IsChangePoint: Marked as a change because one of 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 330.5763527724273 (T) = (0 -0.33952873436713127) / Math.Sqrt((0.002006428365616392 / (299)) + (0.0016201582213339046 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8917019902850114 = (3.1351336489071233 - 0.33952873436713127) / 3.1351336489071233 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.InverseBenchmark()
       push      rsi
       sub       rsp,30
       mov       rsi,rdx
       xor       ecx,ecx
       mov       [rsp+20],ecx
       mov       [rsp+24],ecx
       mov       [rsp+28],ecx
       mov       dword ptr [rsp+2C],3F800000
       mov       rcx,rsi
       lea       rdx,[rsp+20]
       call      qword ptr [7FFBBF521CD8]; System.Numerics.Quaternion.Inverse(System.Numerics.Quaternion)
       mov       rax,rsi
       add       rsp,30
       pop       rsi
       ret
; Total bytes of code 53
; System.Numerics.Quaternion.Inverse(System.Numerics.Quaternion)
       vzeroupper
       vmovss    xmm0,dword ptr [rdx]
       vmovss    xmm1,dword ptr [rdx+4]
       vmovss    xmm2,dword ptr [rdx+8]
       vmovss    xmm3,dword ptr [rdx+0C]
       vmulss    xmm4,xmm0,xmm0
       vmulss    xmm5,xmm1,xmm1
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm2,xmm2
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm3,xmm3
       vaddss    xmm4,xmm4,xmm5
       vmovss    xmm5,dword ptr [7FFBBF165080]
       vdivss    xmm4,xmm5,xmm4
       vxorps    xmm0,xmm0,[7FFBBF165090]
       vmulss    xmm0,xmm0,xmm4
       vxorps    xmm1,xmm1,[7FFBBF165090]
       vmulss    xmm1,xmm1,xmm4
       vxorps    xmm2,xmm2,[7FFBBF165090]
       vmulss    xmm2,xmm2,xmm4
       vmulss    xmm3,xmm3,xmm4
       vmovss    dword ptr [rcx],xmm0
       vmovss    dword ptr [rcx+4],xmm1
       vmovss    dword ptr [rcx+8],xmm2
       vmovss    dword ptr [rcx+0C],xmm3
       mov       rax,rcx
       ret
; Total bytes of code 125
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.InverseBenchmark()
       vzeroupper
       vmovups   xmm0,[7FFE7D914FF0]
       vmulps    xmm1,xmm0,[7FFE7D915000]
       vdpps     xmm0,xmm0,xmm0,0FF
       vdivps    xmm0,xmm1,xmm0
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 37
```
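
Both listings above compute the quaternion inverse as the conjugate divided by the squared length. The baseline flips the sign of X, Y and Z with three `vxorps` and scales lane by lane. The compare build uses one `vmulps` by a constant, one `vdpps` and one `vdivps`. In the Python sketch below, the contents of that constant are assumed to be `(-1, -1, -1, 1)`, inferred from the baseline's sign flips rather than read from the listing.

```python
def inverse(q):
    # Assumed contents of the constant multiplied in by vmulps (not shown in the listing).
    conj_sign = [-1.0, -1.0, -1.0, 1.0]
    conj = [x * s for x, s in zip(q, conj_sign)]   # vmulps xmm1,xmm0,[const]
    length_sq = sum(x * x for x in q)              # vdpps  xmm0,xmm0,xmm0,0FF (broadcast)
    return [c / length_sq for c in conj]           # vdivps xmm0,xmm1,xmm0

print(inverse([0.0, 0.0, 0.0, 1.0]))
print(inverse([0.0, 2.0, 0.0, 0.0]))
```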

System.Numerics.Tests.Perf_Quaternion.EqualsBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.1570510375529828 < 10.578584521460762.
IsChangePoint: Marked as a change because one of 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 1227.9585046071422 (T) = (0 -0.15686719881910274) / Math.Sqrt((0.024475327000570382 / (299)) + (5.423642098135168E-07 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.9860788885304614 = (11.268295578435104 - 0.15686719881910274) / 11.268295578435104 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.EqualsBenchmark()
       sub       rsp,28
       vzeroupper
       xor       eax,eax
       mov       [rsp+18],rax
       mov       [rsp+20],rax
       xor       eax,eax
       mov       [rsp+18],eax
       mov       [rsp+1C],eax
       mov       [rsp+20],eax
       mov       dword ptr [rsp+24],3F800000
       mov       [rsp+8],eax
       mov       [rsp+0C],eax
       mov       [rsp+10],eax
       mov       dword ptr [rsp+14],3F800000
       vmovups   xmm0,[rsp+18]
       vmovups   xmm1,[rsp+8]
       vcmpeqps  xmm2,xmm0,xmm0
       vcmpeqps  xmm3,xmm1,xmm1
       vorps     xmm2,xmm2,xmm3
       vpcmpeqd  xmm3,xmm3,xmm3
       vxorps    xmm2,xmm2,xmm3
       vcmpeqps  xmm0,xmm0,xmm1
       vorps     xmm0,xmm0,xmm2
       vpcmpeqd  xmm0,xmm0,xmm3
       vpmovmskb eax,xmm0
       cmp       eax,0FFFF
       sete      al
       movzx     eax,al
       add       rsp,28
       ret
; Total bytes of code 128
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.EqualsBenchmark()
       vzeroupper
       vmovups   xmm0,[7FFEB55E4EB0]
       vcmpeqps  xmm0,xmm0,[7FFEB55E4EB0]
       vorps     xmm1,xmm0,xmm0
       vpcmpeqd  xmm2,xmm2,xmm2
       vxorps    xmm1,xmm1,xmm2
       vorps     xmm0,xmm0,xmm1
       vpcmpeqd  xmm0,xmm0,xmm2
       vpmovmskb eax,xmm0
       cmp       eax,0FFFF
       sete      al
       movzx     eax,al
       ret
; Total bytes of code 56
```
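
Read lane by lane, the mask arithmetic in the `EqualsBenchmark` listings above appears to compute `(a == b) | ~((a == a) | (b == b))` and then require every lane to be set (`vpmovmskb` followed by `cmp eax,0FFFF`). Under that reading, two lanes match when they compare equal or when both are NaN. The Python sketch below restates that reading with hypothetical function names. It is an interpretation of the disassembly, not the library source.

```python
import math

def lanes_equal(a, b):
    # (a == b) covers ordinary values, including 0.0 vs -0.0.
    # ~((a == a) | (b == b)) is set only when both lanes are NaN.
    return a == b or (math.isnan(a) and math.isnan(b))

def quaternion_equals(p, q):
    # All four lanes must match, mirroring the 0xFFFF byte-mask test.
    return all(lanes_equal(x, y) for x, y in zip(p, q))

identity = [0.0, 0.0, 0.0, 1.0]
print(quaternion_equals(identity, identity))
print(quaternion_equals(identity, [0.0, 0.0, 0.0, 2.0]))
```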

System.Numerics.Tests.Perf_Quaternion.CreateFromVector3WithScalarBenchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.07827335215603227 < 6.0910311273094.
IsChangePoint: Marked as a change because one of 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 429.46915789282116 (T) = (0 -0.077868307271839) / Math.Sqrt((0.06848500916732923 / (299)) + (8.658999998626379E-07 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.9881624426035137 = (6.578072203895047 - 0.077868307271839) / 6.578072203895047 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.CreateFromVector3WithScalarBenchmark()
       sub       rsp,18
       vzeroupper
       vxorps    xmm0,xmm0,xmm0
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFDA3A54FF0]
       vmovsd    qword ptr [rsp+8],xmm0
       vextractps dword ptr [rsp+10],xmm0,2
       mov       dword ptr [rsp+14],40800000
       vmovups   xmm0,[rsp+8]
       vmovups   [rdx],xmm0
       mov       rax,rdx
       add       rsp,18
       ret
; Total bytes of code 65
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Quaternion.CreateFromVector3WithScalarBenchmark()
       vzeroupper
       vmovups   xmm0,[7FFD14AF4FC0]
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 19
```
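
The immediates `3F800000` and `40800000` stored by the baseline listings are IEEE 754 single-precision bit patterns. The small Python helper below decodes them, which makes it easier to see that the baseline builds the quaternion on the stack field by field while the compare build loads one pre-built 16-byte constant. The helper name is illustrative.

```python
import struct

def f32_from_bits(bits):
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(f32_from_bits(0x3F800000))  # W of the identity quaternion in the Normalize/Inverse/Equals baselines
print(f32_from_bits(0x40800000))  # scalar stored at [rsp+14] in CreateFromVector3WithScalarBenchmark
```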


Improvements in System.Numerics.Tests.Perf_Vector4

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TransformVector2ByMatrix4x4Benchmark - Duration of single invocation | 2.97 ns | 0.47 ns | 0.16 | 0.09 | False | 35.14344381643644 | 14.8688205480876 | 0.4230894566210246 | Trace | Trace |
| TransformByMatrix4x4Benchmark - Duration of single invocation | 3.51 ns | 0.80 ns | 0.23 | 0.14 | False | 43.76280729261251 | 19.852150202608616 | 0.45363063822370026 | Trace | Trace |
| TransformVector3ByMatrix4x4Benchmark - Duration of single invocation | 3.55 ns | 0.80 ns | 0.23 | 0.10 | False | 43.76220141282348 | 19.84830104837284 | 0.4535489625198965 | Trace | Trace |


Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector4*'


System.Numerics.Tests.Perf_Vector4.TransformVector2ByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementWindowed: Marked as improvement because 0.46956969714264696 < 2.82334894453912.
IsChangePoint: Marked as a change because one of 1/7/2023 12:45:06 AM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 60.649307495920226 (T) = (0 -0.46846269452974737) / Math.Sqrt((1.095176670794964 / (299)) + (3.2559796635305216E-06 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8868198298537264 = (4.1390880922365465 - 0.46846269452974737) / 4.1390880922365465 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
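
The IsImprovementStdDev line can be checked by hand. The sketch below recomputes it in Python under explicit assumptions: the roles assigned to each number (means, variances, sample counts) are inferred from the arithmetic, not from the tool's source, and the critical value is copied from the log instead of being recomputed. Note that the logged numerator reads `0 - 0.468...`, yet the logged statistic is only reproduced when the baseline mean (4.139...) takes the place of that `0`.

```python
import math

# Values copied from the IsImprovementStdDev line above.
# Their roles are assumptions inferred from the arithmetic.
baseline_mean, baseline_var, baseline_n = 4.1390880922365465, 1.095176670794964, 299
compare_mean, compare_var, compare_n = 0.46846269452974737, 3.2559796635305216e-06, 25

# Logged result of StudentT.InvCDF(0, 1, 299 + 25 - 2, .975); not recomputed here.
critical_t = 1.9673585853226652

# Two-sample t statistic with unpooled variances.
std_err = math.sqrt(baseline_var / baseline_n + compare_var / compare_n)
t_stat = (baseline_mean - compare_mean) / std_err

# Relative drop of the mean, compared against the 0.05 threshold in the log.
relative_drop = (baseline_mean - compare_mean) / baseline_mean

is_improvement = t_stat > critical_t and relative_drop > 0.05
```

With these inputs `t_stat` comes out at roughly 60.65 and `relative_drop` at roughly 0.887, in line with the logged 60.649307495920226 and 0.8868198298537264.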

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformVector2ByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFAA7B15390]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFAA7B153A0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFAA7B153B0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFAA7B153C0]
       vmovups   [rsp+38],xmm0
       vmovsd    xmm0,qword ptr [7FFAA7B153D0]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm0,xmm0
       vmulss    xmm3,xmm0,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm3
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm3,xmm1,dword ptr [rsp+0C]
       vmulss    xmm4,xmm0,dword ptr [rsp+1C]
       vaddss    xmm3,xmm3,xmm4
       vaddss    xmm3,xmm3,dword ptr [rsp+3C]
       vinsertps xmm2,xmm2,xmm3,10
       vmulss    xmm3,xmm1,dword ptr [rsp+10]
       vmulss    xmm4,xmm0,dword ptr [rsp+20]
       vaddss    xmm3,xmm3,xmm4
       vaddss    xmm3,xmm3,dword ptr [rsp+40]
       vinsertps xmm2,xmm2,xmm3,20
       vmulss    xmm1,xmm1,dword ptr [rsp+14]
       vmulss    xmm0,xmm0,dword ptr [rsp+24]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+44]
       vinsertps xmm0,xmm2,xmm0,30
       vmovups   [rdx],xmm0
       mov       rax,rdx
       add       rsp,48
       ret
; Total bytes of code 197
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformVector2ByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FF99F8052F0]
       vmovups   xmm1,[7FF99F805300]
       vmovups   xmm2,[7FF99F805310]
       vmovsd    xmm3,qword ptr [7FF99F805320]
       vmovaps   xmm4,xmm3
       vbroadcastss xmm4,xmm4
       vmulps    xmm0,xmm0,xmm4
       vmovshdup xmm3,xmm3
       vbroadcastss xmm3,xmm3
       vmulps    xmm1,xmm1,xmm3
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm2
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 77
```
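
The two listings compute the same values in different shapes: the baseline builds each output lane with scalar `vmulss`/`vaddss` chains and merges the lanes with `vinsertps`, while the compare code broadcasts each vector component and works on whole matrix rows with `vmulps`/`vaddps`. A minimal Python sketch of that equivalence, modelling a row as a list of four floats (function names and the sample matrix are hypothetical, chosen only for illustration):

```python
from typing import List, Sequence

Row = Sequence[float]

def transform_scalar(x: float, y: float, m: List[Row]) -> List[float]:
    """Baseline shape: one multiply-add chain per output lane, lanes merged afterwards."""
    return [x * m[0][lane] + y * m[1][lane] + m[3][lane] for lane in range(4)]

def transform_broadcast(x: float, y: float, m: List[Row]) -> List[float]:
    """Compare shape: broadcast x and y, multiply whole rows, then add rows."""
    xs = [x] * 4                                   # vbroadcastss of the x component
    ys = [y] * 4                                   # vbroadcastss of the y component
    r1 = [a * b for a, b in zip(m[0], xs)]         # vmulps row1, xs
    r2 = [a * b for a, b in zip(m[1], ys)]         # vmulps row2, ys
    return [a + b + c for a, b, c in zip(r1, r2, m[3])]  # two vaddps, row 4 added last

# Hypothetical sample data; the third row is unused for a two-component input.
matrix = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
scalar_result = transform_scalar(2.0, 3.0, matrix)
broadcast_result = transform_broadcast(2.0, 3.0, matrix)
```

Both functions add the terms in the same order, so the results match exactly for this input. The row-wise form needs two multiplies and two adds in total instead of two multiplies and two adds per lane.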

System.Numerics.Tests.Perf_Vector4.TransformByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsImprovementWindowed: Marked as improvement because 0.7990715436816171 < 3.3793596540183466.
IsChangePoint: Marked as a change because one of 1/7/2023 8:10:22 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 56.23178027483706 (T) = (0 -0.8002925052989802) / Math.Sqrt((1.780432604929649 / (299)) + (3.154999256857275E-06 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8442868882005589 = (5.139531899726972 - 0.8002925052989802) / 5.139531899726972 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
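
The IsImprovementBase and IsRegressionBase lines describe a simple threshold test. A rough Python sketch of that reading follows. The 5% figure comes from the log; everything else is an assumption: the function names are hypothetical, the cutoff behind "the value was not too small" is not stated in the report (so `min_value` is a placeholder defaulting to 0), and it is a guess that the cutoff applies to the baseline value.

```python
def is_improvement_base(baseline: float, compare: float,
                        threshold: float = 0.05, min_value: float = 0.0) -> bool:
    """True when compare is more than `threshold` (as a fraction) below baseline.

    `min_value` stands in for the unspecified "too small" cutoff.
    """
    return baseline > min_value and compare < baseline * (1.0 - threshold)

def is_regression_base(baseline: float, compare: float,
                       threshold: float = 0.05, min_value: float = 0.0) -> bool:
    """True when compare is more than `threshold` (as a fraction) above baseline."""
    return baseline > min_value and compare > baseline * (1.0 + threshold)

# Rounded durations for this benchmark from the summary table: 3.51 ns -> 0.80 ns.
improved = is_improvement_base(3.51, 0.80)
regressed = is_regression_base(3.51, 0.80)
```

For the 3.51 ns to 0.80 ns change this sketch reports an improvement and no regression, which matches the two log lines.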

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FF7F45D53C0]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FF7F45D53D0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FF7F45D53E0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FF7F45D53F0]
       vmovups   [rsp+38],xmm0
       vmovups   xmm0,[7FF7F45D5400]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm3,xmm0
       vmulss    xmm4,xmm3,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm4
       vunpckhps xmm0,xmm0,xmm0
       vmulss    xmm4,xmm0,dword ptr [rsp+28]
       vaddss    xmm2,xmm2,xmm4
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm4,xmm1,dword ptr [rsp+0C]
       vmulss    xmm5,xmm3,dword ptr [rsp+1C]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+2C]
       vaddss    xmm4,xmm4,xmm5
       vaddss    xmm4,xmm4,dword ptr [rsp+3C]
       vinsertps xmm2,xmm2,xmm4,10
       vmulss    xmm4,xmm1,dword ptr [rsp+10]
       vmulss    xmm5,xmm3,dword ptr [rsp+20]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+30]
       vaddss    xmm4,xmm4,xmm5
       vaddss    xmm4,xmm4,dword ptr [rsp+40]
       vinsertps xmm2,xmm2,xmm4,20
       vmulss    xmm1,xmm1,dword ptr [rsp+14]
       vmulss    xmm3,xmm3,dword ptr [rsp+24]
       vaddss    xmm1,xmm1,xmm3
       vmulss    xmm0,xmm0,dword ptr [rsp+34]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+44]
       vinsertps xmm0,xmm2,xmm0,30
       vmovups   [rdx],xmm0
       mov       rax,rdx
       add       rsp,48
       ret
; Total bytes of code 241
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FF7DE0F5310]
       vmovups   xmm1,[7FF7DE0F5320]
       vmovups   xmm2,[7FF7DE0F5330]
       vmovups   xmm3,[7FF7DE0F5340]
       vmovups   xmm4,[7FF7DE0F5350]
       vmovaps   xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm0,xmm0,xmm5
       vmovshdup xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm1,xmm1,xmm5
       vaddps    xmm0,xmm0,xmm1
       vunpckhps xmm1,xmm4,xmm4
       vbroadcastss xmm1,xmm1
       vmulps    xmm1,xmm2,xmm1
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm3
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 102
```
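
The scalar baseline listings in this report merge per-lane results with `vinsertps`, whose last operand is an 8-bit immediate (10, 20, 30, 28 and 1C appear in the listings). As a reading aid, here is a small Python decoder for that immediate, following the INSERTPS encoding: bits 7:6 select the source lane, bits 5:4 the destination lane, and bits 3:0 are a zero mask. It assumes the disassembler prints the immediate in hexadecimal without a prefix; the function name is hypothetical.

```python
from typing import Dict, List

def decode_insertps_imm8(imm8: int) -> Dict[str, object]:
    """Split an INSERTPS/VINSERTPS immediate into its three fields."""
    zero_mask = imm8 & 0b1111
    return {
        "src_lane": (imm8 >> 6) & 0b11,   # lane read from the source register
        "dst_lane": (imm8 >> 4) & 0b11,   # lane written in the destination
        "zeroed_lanes": [lane for lane in range(4) if zero_mask & (1 << lane)],
    }

# Immediates as printed in the baseline listings, parsed as hex.
decoded = {text: decode_insertps_imm8(int(text, 16)) for text in ["10", "20", "30", "28", "1C"]}
```

Read this way, `10`, `20` and `30` place lane 0 of the source into lanes 1, 2 and 3 of the result. `28` writes lane 2 and zeroes lane 3, and `1C` writes lane 1 and zeroes lanes 2 and 3, which fits results that only carry three or two components.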

System.Numerics.Tests.Perf_Vector4.TransformVector3ByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsImprovementWindowed: Marked as improvement because 0.8024386075447032 < 3.3542761804117536.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 57.551359628398096 (T) = (0 -0.8000032445235216) / Math.Sqrt((1.6723618037356205 / (299)) + (2.450451790708422E-06 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8432647212296306 = (5.104168319983625 - 0.8000032445235216) / 5.104168319983625 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
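
The IsChangePoint line reads as a window test: the benchmark counts as changed if any candidate change-point timestamp lies inside the observed interval. A minimal Python sketch of that check, using the timestamps from the log. Treating both window bounds as inclusive is an assumption, as are the helper and variable names.

```python
from datetime import datetime
from typing import List

# Timestamp layout used in the log, e.g. "2/2/2023 5:46:53 AM".
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_log_time(text: str) -> datetime:
    """Parse a timestamp in the log's month/day/year 12-hour format."""
    return datetime.strptime(text, LOG_FORMAT)

candidates: List[str] = ["1/6/2023 6:58:01 PM", "2/2/2023 5:46:53 AM", "2/7/2023 2:48:42 AM"]
window_start = parse_log_time("1/29/2023 12:56:15 AM")
window_end = parse_log_time("2/7/2023 2:48:42 AM")

# Keep the candidates that fall inside the window (bounds assumed inclusive).
in_window = [t for t in candidates if window_start <= parse_log_time(t) <= window_end]
is_change_point = bool(in_window)
```

Here the 2/2/2023 timestamp falls inside the window regardless of how the bounds are treated, so the check passes.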

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformVector3ByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFDA43B53C0]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFDA43B53D0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFDA43B53E0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFDA43B53F0]
       vmovups   [rsp+38],xmm0
       vmovups   xmm0,[7FFDA43B5400]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm3,xmm0
       vmulss    xmm4,xmm3,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm4
       vunpckhps xmm0,xmm0,xmm0
       vmulss    xmm4,xmm0,dword ptr [rsp+28]
       vaddss    xmm2,xmm2,xmm4
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm4,xmm1,dword ptr [rsp+0C]
       vmulss    xmm5,xmm3,dword ptr [rsp+1C]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+2C]
       vaddss    xmm4,xmm4,xmm5
       vaddss    xmm4,xmm4,dword ptr [rsp+3C]
       vinsertps xmm2,xmm2,xmm4,10
       vmulss    xmm4,xmm1,dword ptr [rsp+10]
       vmulss    xmm5,xmm3,dword ptr [rsp+20]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+30]
       vaddss    xmm4,xmm4,xmm5
       vaddss    xmm4,xmm4,dword ptr [rsp+40]
       vinsertps xmm2,xmm2,xmm4,20
       vmulss    xmm1,xmm1,dword ptr [rsp+14]
       vmulss    xmm3,xmm3,dword ptr [rsp+24]
       vaddss    xmm1,xmm1,xmm3
       vmulss    xmm0,xmm0,dword ptr [rsp+34]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+44]
       vinsertps xmm0,xmm2,xmm0,30
       vmovups   [rdx],xmm0
       mov       rax,rdx
       add       rsp,48
       ret
; Total bytes of code 241
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector4.TransformVector3ByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FF8FF205310]
       vmovups   xmm1,[7FF8FF205320]
       vmovups   xmm2,[7FF8FF205330]
       vmovups   xmm3,[7FF8FF205340]
       vmovups   xmm4,[7FF8FF205350]
       vmovaps   xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm0,xmm0,xmm5
       vmovshdup xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm1,xmm1,xmm5
       vaddps    xmm0,xmm0,xmm1
       vunpckhps xmm1,xmm4,xmm4
       vbroadcastss xmm1,xmm1
       vmulps    xmm1,xmm2,xmm1
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm3
       vmovups   [rdx],xmm0
       mov       rax,rdx
       ret
; Total bytes of code 102
```

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

performanceautofiler bot commented Feb 7, 2023

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 2ba2396495c22429035d165e478672c442f81e22
Compare 6aa9f8b5a5d7ea4d79715f0b16f2a5b0ab6ac48d
Diff Diff

Improvements in System.Numerics.Tests.Perf_Vector3

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TransformNormalByMatrix4x4Benchmark - Duration of single invocation | 2.65 ns | 0.80 ns | 0.30 | 0.18 | False | | | | Trace | Trace |
| TransformByMatrix4x4Benchmark - Duration of single invocation | 3.34 ns | 1.05 ns | 0.31 | 0.11 | False | | | | Trace | Trace |

Test Report

Repro

```cmd
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector3*'
```

Payloads

Baseline
Compare

Histogram

System.Numerics.Tests.Perf_Vector3.TransformNormalByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsImprovementWindowed: Marked as improvement because 0.7990485963637398 < 2.5307927321336683.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 49.97136763411228 (T) = (0 -0.7998774658760907) / Math.Sqrt((1.1956679399951837 / (299)) + (2.281304938858914E-06 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.7980077590615858 = (3.9599415411207146 - 0.7998774658760907) / 3.9599415411207146 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector3.TransformNormalByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFB80435380]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFB80435390]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFB804353A0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFB804353B0]
       vmovups   [rsp+38],xmm0
       vmovups   xmm0,[7FFB804353C0]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm3,xmm0
       vmulss    xmm4,xmm3,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm4
       vunpckhps xmm0,xmm0,xmm0
       vmulss    xmm4,xmm0,dword ptr [rsp+28]
       vaddss    xmm2,xmm2,xmm4
       vmulss    xmm4,xmm1,dword ptr [rsp+0C]
       vmulss    xmm5,xmm3,dword ptr [rsp+1C]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+2C]
       vaddss    xmm4,xmm4,xmm5
       vinsertps xmm2,xmm2,xmm4,10
       vmulss    xmm1,xmm1,dword ptr [rsp+10]
       vmulss    xmm3,xmm3,dword ptr [rsp+20]
       vaddss    xmm1,xmm1,xmm3
       vmulss    xmm0,xmm0,dword ptr [rsp+30]
       vaddss    xmm0,xmm1,xmm0
       vinsertps xmm0,xmm2,xmm0,28
       vmovsd    qword ptr [rdx],xmm0
       vextractps dword ptr [rdx+8],xmm0,2
       mov       rax,rdx
       add       rsp,48
       ret
; Total bytes of code 192
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector3.TransformNormalByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FFBB5D15310]
       vmovups   xmm1,[7FFBB5D15320]
       vmovups   xmm2,[7FFBB5D15330]
       vmovups   xmm3,[7FFBB5D15340]
       vmovaps   xmm4,xmm3
       vbroadcastss xmm4,xmm4
       vmulps    xmm0,xmm0,xmm4
       vmovshdup xmm4,xmm3
       vbroadcastss xmm4,xmm4
       vmulps    xmm1,xmm1,xmm4
       vaddps    xmm0,xmm0,xmm1
       vunpckhps xmm1,xmm3,xmm3
       vbroadcastss xmm1,xmm1
       vmulps    xmm1,xmm2,xmm1
       vaddps    xmm0,xmm0,xmm1
       vmovsd    qword ptr [rdx],xmm0
       vextractps dword ptr [rdx+8],xmm0,2
       mov       rax,rdx
       ret
; Total bytes of code 97
```
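
Compared with the Perf_Vector3.TransformByMatrix4x4Benchmark listing in this report, the TransformNormal code loads only three matrix rows and has no final `vaddps` of a fourth row; both versions store just three lanes (`vmovsd` plus `vextractps`). A small Python sketch of that difference, assuming the fourth row holds the translation terms (function names and sample values are hypothetical):

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]
Vec3 = Tuple[float, float, float]

def _rows_times_components(v: Vec3, m: List[Row]) -> List[float]:
    """Sum of row1*x + row2*y + row3*z across all four lanes."""
    x, y, z = v
    return [m[0][i] * x + m[1][i] * y + m[2][i] * z for i in range(4)]

def transform_normal(v: Vec3, m: List[Row]) -> List[float]:
    """Direction transform: the fourth row is not added; three lanes are kept."""
    return _rows_times_components(v, m)[:3]

def transform_point(v: Vec3, m: List[Row]) -> List[float]:
    """Point transform: the fourth row is added before keeping three lanes."""
    acc = _rows_times_components(v, m)
    return [a + t for a, t in zip(acc, m[3])][:3]

# Hypothetical sample data.
matrix = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
normal_result = transform_normal((1.0, 2.0, 3.0), matrix)
point_result = transform_point((1.0, 2.0, 3.0), matrix)
```

The two results differ by exactly the first three entries of the fourth row, which corresponds to the one extra `vaddps` (and one extra row load) in the point-transform listing.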

System.Numerics.Tests.Perf_Vector3.TransformByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsImprovementWindowed: Marked as improvement because 1.051784785478769 < 3.1429902565540915.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 56.12778158159004 (T) = (0 -1.0480914358753872) / Math.Sqrt((1.172273686803947 / (299)) + (0.0006597691172665521 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.7708761816212986 = (4.574345187208238 - 1.0480914358753872) / 4.574345187208238 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector3.TransformByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFD8D5053A0]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFD8D5053B0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFD8D5053C0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFD8D5053D0]
       vmovups   [rsp+38],xmm0
       vmovups   xmm0,[7FFD8D5053E0]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm3,xmm0
       vmulss    xmm4,xmm3,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm4
       vunpckhps xmm0,xmm0,xmm0
       vmulss    xmm4,xmm0,dword ptr [rsp+28]
       vaddss    xmm2,xmm2,xmm4
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm4,xmm1,dword ptr [rsp+0C]
       vmulss    xmm5,xmm3,dword ptr [rsp+1C]
       vaddss    xmm4,xmm4,xmm5
       vmulss    xmm5,xmm0,dword ptr [rsp+2C]
       vaddss    xmm4,xmm4,xmm5
       vaddss    xmm4,xmm4,dword ptr [rsp+3C]
       vinsertps xmm2,xmm2,xmm4,10
       vmulss    xmm1,xmm1,dword ptr [rsp+10]
       vmulss    xmm3,xmm3,dword ptr [rsp+20]
       vaddss    xmm1,xmm1,xmm3
       vmulss    xmm0,xmm0,dword ptr [rsp+30]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+40]
       vinsertps xmm0,xmm2,xmm0,28
       vmovsd    qword ptr [rdx],xmm0
       vextractps dword ptr [rdx+8],xmm0,2
       mov       rax,rdx
       add       rsp,48
       ret
; Total bytes of code 210
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector3.TransformByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FF9BDB25310]
       vmovups   xmm1,[7FF9BDB25320]
       vmovups   xmm2,[7FF9BDB25330]
       vmovups   xmm3,[7FF9BDB25340]
       vmovups   xmm4,[7FF9BDB25350]
       vmovaps   xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm0,xmm0,xmm5
       vmovshdup xmm5,xmm4
       vbroadcastss xmm5,xmm5
       vmulps    xmm1,xmm1,xmm5
       vaddps    xmm0,xmm0,xmm1
       vunpckhps xmm1,xmm4,xmm4
       vbroadcastss xmm1,xmm1
       vmulps    xmm1,xmm2,xmm1
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm3
       vmovsd    qword ptr [rdx],xmm0
       vextractps dword ptr [rdx+8],xmm0,2
       mov       rax,rdx
       ret
; Total bytes of code 109
```


### Run Information
Architecture x64
OS Windows 10.0.18362
Baseline 2ba2396495c22429035d165e478672c442f81e22
Compare 6aa9f8b5a5d7ea4d79715f0b16f2a5b0ab6ac48d
Diff Diff

Improvements in System.Numerics.Tests.Perf_Vector2

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TransformByMatrix4x4Benchmark - Duration of single invocation | 1.54 ns | 0.40 ns | 0.26 | 0.20 | False | 22.958374669579698 | 12.865332047514357 | 0.5603764305040799 | Trace | Trace |

Test Report

Repro

```cmd
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector2*'
```

Payloads

Baseline
Compare

Histogram

System.Numerics.Tests.Perf_Vector2.TransformByMatrix4x4Benchmark


Description of detection logic

IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsRegressionBase: Marked as not a regression because the compare was not 5% greater than the baseline, or the value was too small.
IsImprovementBase: Marked as improvement because the compare was 5% less than the baseline, and the value was not too small.
IsImprovementCheck: Marked as improvement because the three check build points were 0.05 less than the baseline.
IsImprovementWindowed: Marked as improvement because 0.40277414014951346 < 1.464601560612459.
IsChangePoint: Marked as a change because one of 1/6/2023 6:58:01 PM, 2/2/2023 5:46:53 AM, 2/7/2023 2:48:42 AM falls between 1/29/2023 12:56:15 AM and 2/7/2023 2:48:42 AM.
IsImprovementStdDev: Marked as improvement because 53.696927945589685 (T) = (0 -0.41671828698900937) / Math.Sqrt((0.28108519512271274 / (299)) + (0.004877744052787956 / (25))) is greater than 1.9673585853226652 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (299) + (25) - 2, .975) and 0.8127874146311673 = (2.2259095785041434 - 0.41671828698900937) / 2.2259095785041434 is greater than 0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

### Baseline Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector2.TransformByMatrix4x4Benchmark()
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFDDD5A5290]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFDDD5A52A0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFDDD5A52B0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFDDD5A52C0]
       vmovups   [rsp+38],xmm0
       vmovsd    xmm0,qword ptr [7FFDDD5A52D0]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm0,xmm0
       vmulss    xmm3,xmm0,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm3
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm1,xmm1,dword ptr [rsp+0C]
       vmulss    xmm0,xmm0,dword ptr [rsp+1C]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+3C]
       vinsertps xmm0,xmm2,xmm0,1C
       vmovq     rax,xmm0
       add       rsp,48
       ret
; Total bytes of code 139
```

### Compare Jit Disasm

```assembly
; System.Numerics.Tests.Perf_Vector2.TransformByMatrix4x4Benchmark()
       vzeroupper
       vmovups   xmm0,[7FFCB5D85230]
       vmovups   xmm1,[7FFCB5D85240]
       vmovups   xmm2,[7FFCB5D85250]
       vmovsd    xmm3,qword ptr [7FFCB5D85260]
       vmovaps   xmm4,xmm3
       vbroadcastss xmm4,xmm4
       vmulps    xmm0,xmm0,xmm4
       vmovshdup xmm3,xmm3
       vbroadcastss xmm3,xmm3
       vmulps    xmm1,xmm1,xmm3
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm2
       vmovq     rax,xmm0
       ret
; Total bytes of code 75
```
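
One quick way to compare two listings like these is to tally their mnemonics. The Python sketch below does that for the Perf_Vector2 listings above; the helper name is hypothetical, and the parsing assumes the layout shown here (one instruction per line, comment lines starting with `;`).

```python
from collections import Counter

def mnemonic_counts(listing: str) -> Counter:
    """Count the first token of each instruction line, skipping blanks and ';' comments."""
    counts: Counter = Counter()
    for line in listing.splitlines():
        line = line.strip()
        if line and not line.startswith(";"):
            counts[line.split()[0]] += 1
    return counts

baseline_listing = """
       sub       rsp,48
       vzeroupper
       vmovups   xmm0,[7FFDDD5A5290]
       vmovups   [rsp+8],xmm0
       vmovups   xmm0,[7FFDDD5A52A0]
       vmovups   [rsp+18],xmm0
       vmovups   xmm0,[7FFDDD5A52B0]
       vmovups   [rsp+28],xmm0
       vmovups   xmm0,[7FFDDD5A52C0]
       vmovups   [rsp+38],xmm0
       vmovsd    xmm0,qword ptr [7FFDDD5A52D0]
       vmovaps   xmm1,xmm0
       vmulss    xmm2,xmm1,dword ptr [rsp+8]
       vmovshdup xmm0,xmm0
       vmulss    xmm3,xmm0,dword ptr [rsp+18]
       vaddss    xmm2,xmm2,xmm3
       vaddss    xmm2,xmm2,dword ptr [rsp+38]
       vmulss    xmm1,xmm1,dword ptr [rsp+0C]
       vmulss    xmm0,xmm0,dword ptr [rsp+1C]
       vaddss    xmm0,xmm1,xmm0
       vaddss    xmm0,xmm0,dword ptr [rsp+3C]
       vinsertps xmm0,xmm2,xmm0,1C
       vmovq     rax,xmm0
       add       rsp,48
       ret
"""

compare_listing = """
       vzeroupper
       vmovups   xmm0,[7FFCB5D85230]
       vmovups   xmm1,[7FFCB5D85240]
       vmovups   xmm2,[7FFCB5D85250]
       vmovsd    xmm3,qword ptr [7FFCB5D85260]
       vmovaps   xmm4,xmm3
       vbroadcastss xmm4,xmm4
       vmulps    xmm0,xmm0,xmm4
       vmovshdup xmm3,xmm3
       vbroadcastss xmm3,xmm3
       vmulps    xmm1,xmm1,xmm3
       vaddps    xmm0,xmm0,xmm1
       vaddps    xmm0,xmm0,xmm2
       vmovq     rax,xmm0
       ret
"""

baseline_counts = mnemonic_counts(baseline_listing)
compare_counts = mnemonic_counts(compare_listing)
```

For these listings the tally shows 25 instructions in the baseline against 15 in the compare build. The eight `vmovups` that copy the matrix through the stack, and the scalar `vmulss`/`vaddss` pairs, give way to three row loads and two packed `vmulps`/`vaddps` pairs.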


@performanceautofiler performanceautofiler bot added CoreClr PGO Applied if there were any profile guided optimization updates in the observed interval. untriaged labels Feb 7, 2023
@AndyAyersMS

dotnet/runtime#81335
