Adding support for vector constants via GenTreeVecCon #68874

tannergooding · 2022-05-04T22:03:45Z

This adds direct support for vector constant nodes via GenTreeVecCon. It involved quite a bit of cleanup to normalize the places that were touching GT_SIMD, those that were touching GT_HWINTRINSIC, and those that touched both, so it grew a bit more than I initially desired.

The overall result, however, is that vector constants are now handled centrally, with fewer allocations and fewer nodes required to represent them. They are also now base type independent, which allows more CSE opportunities and means that you can easily view the bits however the user needs to interpret them. This in turn will simplify the logic for the xplat shuffle APIs, where otherwise simple operations such as .AsInt32() would break the handling and force it down the fallback path.
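To make "base type independent" concrete, here is a minimal sketch of a constant payload that stores only raw bits. The names `Simd16Bits`, `Broadcast32`, and `Lane32` are hypothetical and are not the JIT's actual types or helpers; the sketch also assumes IEEE 754 single-precision floats. It only illustrates why two constants with the same bits can be treated as the same value regardless of the element type used to create or read them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-byte vector constant payload; not the JIT's actual type.
struct Simd16Bits
{
    uint8_t bytes[16];

    // Equality looks only at the raw bits, never at an element type.
    bool operator==(const Simd16Bits& other) const
    {
        return std::memcmp(bytes, other.bytes, sizeof(bytes)) == 0;
    }
};

// Build a constant by broadcasting one 32-bit lane value (float or integer).
template <typename T>
Simd16Bits Broadcast32(T value)
{
    static_assert(sizeof(T) == 4, "expects a 32-bit lane type");
    Simd16Bits result;
    for (int lane = 0; lane < 4; lane++)
    {
        std::memcpy(result.bytes + lane * 4, &value, sizeof(value));
    }
    return result;
}

// Read lane `index` back as any 32-bit type, similar in spirit to a reinterpret like .AsInt32().
template <typename T>
T Lane32(const Simd16Bits& v, int index)
{
    static_assert(sizeof(T) == 4, "expects a 32-bit lane type");
    assert(index >= 0 && index < 4);
    T value;
    std::memcpy(&value, v.bytes + index * 4, sizeof(value));
    return value;
}
```

With this shape, a constant built from `1.0f` lanes and one built from `0x3F800000` integer lanes compare equal, so an optimization keyed on the bits alone could treat them as one constant.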

ghost · 2022-05-04T22:03:55Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

This is a draft that adds direct support for vector constant nodes via GenTreeVecCon. It involved quite a bit of cleanup to normalize the places that were touching GT_SIMD, those that were touching GT_HWINTRINSIC, and those that touched both, so it grew a bit more than I initially desired.

The overall result, however, is that vector constants are now handled centrally, with fewer allocations and fewer nodes required to represent them. They are also now base type independent, which allows more CSE opportunities and means that you can easily view the bits however the user needs to interpret them. This in turn will simplify the logic for the xplat shuffle APIs, where otherwise simple operations such as .AsInt32() would break the handling and force it down the fallback path.

Author:	tannergooding
Assignees:	tannergooding
Labels:	`area-CodeGen-coreclr`
Milestone:	-

tannergooding · 2022-05-04T22:57:01Z

Will re-open after I get tests passing.

tannergooding · 2022-05-10T20:23:31Z

CC. @dotnet/jit-contrib

This is needed to correctly handle the Shuffle APIs and some of the more complex patterns around vector constant nodes that pop up (part of the remaining xplat hwintrinsic APIs: #63331).

It provides some throughput gains on all platforms. Spot-checking the regressions, they mostly come from new CSE opportunities and other optimizations that can now kick in because these nodes are known to be constant. There are a couple of regressions where we emit additional xorps instructions: since it's considered cheap enough, CSE doesn't always kick in (this issue is more broadly tracked by #6264).

src/coreclr/jit/valuenum.cpp

SingleAccretion · 2022-05-10T20:55:08Z

src/coreclr/jit/valuenum.h

+    else
+    {
+        assert(false);
+        return {};


It would be nice to define a simd8/16/32_t::Zero-like static constexpr field (or function); the semantics of { } can be unclear to people not intimately familiar with the C++ initialization rules.
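A minimal sketch of what such a named helper might look like follows. `Simd16Value`, `Zero`, `IsZero`, and `GetHalf` are hypothetical names chosen for illustration, and the layout is an assumption; the JIT's real simd16_t may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte SIMD payload; the real simd16_t in the JIT may be laid out differently.
struct Simd16Value
{
    uint64_t u64[2];

    // Named factory: the intent (all bits zero) is spelled out at the call site.
    static constexpr Simd16Value Zero()
    {
        return Simd16Value{{0, 0}};
    }

    constexpr bool IsZero() const
    {
        return (u64[0] == 0) && (u64[1] == 0);
    }
};

// The helper can be checked at compile time.
static_assert(Simd16Value::Zero().IsZero(), "Zero() must produce an all-zero value");

// Reads one 64-bit half of the payload; `index` must be 0 or 1.
inline uint64_t GetHalf(const Simd16Value& v, int index)
{
    assert(index == 0 || index == 1);
    return v.u64[index];
}
```

In the fallback branch quoted above, `return Simd16Value::Zero();` would then read as an explicit choice rather than relying on the reader knowing that `return {};` value-initializes the result.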

src/coreclr/jit/assertionprop.cpp

SingleAccretion

Some initial feedback; will continue the review tomorrow.

SingleAccretion

The frontend/IR changes look good. I did not drill too much into the backend parts, but they look mostly mechanical (and correct).

Overall, I think this change is a good step in the right direction w.r.t. SIMDs in IR and removes a good amount of suboptimal representation that we had for constant vectors. Thank you for making it happen!

tannergooding · 2022-05-31T17:53:46Z

CC. @dotnet/jit-contrib. This should be ready for review.

src/coreclr/jit/assertionprop.cpp

TIHan

@SingleAccretion did an incredible review of this.

The changes look great. Only a minor comment regarding the CORINFO_TYPE_FLOAT.

For ARM64 diffs, the regressions seem to just appear as regressions due to new CSE opportunities - so that all looks good to me.

AndyAyersMS · 2022-06-07T18:26:07Z

Improvements:

mrsharm · 2022-08-11T20:45:24Z

@tannergooding - from our analysis while creating the perf report for August, we found the following regression that seemed to line up with this PR specifically related to the Ubuntu 18.04 x64 configuration. Would you consider these regressions as "by design" as there are closed auto-filed regressions above?

System.Numerics.Tests.Perf_Matrix4x4.CreateShadowBenchmark

Result	Ratio	Alloc Delta	Operating System	Bit	Processor Name
Same	1.00	+0	Windows 11	Arm64	Microsoft SQ1 3.0 GHz
Same	1.01	+0	Windows 11	Arm64	Microsoft SQ1 3.0 GHz
Slower	0.29	+0	macOS Monterey 12.3	Arm64	Apple M1 Max
Same	1.01	+0	Windows 10	X64	Intel Xeon CPU E5-1650 v4 3.60GHz
Same	1.03	+0	Windows 10	X64	Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same	1.04	+0	Windows 10	X64	Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same	0.99	+0	Windows 10	X64	Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R)
Same	1.02	+0	Windows 10	X64	Intel Core i9-10900K CPU 3.70GHz
Same	1.06	+0	Windows 11	X64	AMD Ryzen Threadripper PRO 3945WX 12-Cores
Same	1.03	+0	Windows 11	X64	AMD Ryzen 9 3950X
Same	0.98	+0	Windows 11	X64	AMD Ryzen 9 5900X
Same	0.95	+0	Windows 11	X64	AMD Ryzen 9 5950X
Same	0.94	+0	Windows 11	X64	Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Same	0.95	+0	Windows 11	X64	Intel Core i9-10900K CPU 3.70GHz
Same	0.92	+0	Windows 11	X64	11th Gen Intel Core i9-11900H 2.50GHz
Slower	0.68	+0	ubuntu 18.04	X64	Intel Xeon CPU E5-1650 v4 3.60GHz
Slower	0.86	+0	ubuntu 18.04	X64	Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge)
Slower	0.62	+0	ubuntu 18.04	X64	Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower	0.68	+0	ubuntu 20.04	X64	AMD Ryzen 9 5900X
Slower	0.58	+0	ubuntu 20.04	X64	Intel Core i9-10900K CPU 3.70GHz
Faster	1.54	+0	Windows 10	X86	Intel Xeon CPU E5-1650 v4 3.60GHz
Same	1.00	+0	Windows 10	X86	Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same	0.99	+0	Windows 11	X86	AMD Ryzen Threadripper PRO 3945WX 12-Cores
Slower	0.65	+0	macOS Big Sur 11.6.8	X64	Intel Core i5-4278U CPU 2.60GHz (Haswell)
Slower	0.64	+0	macOS Monterey 12.3.1	X64	Intel Core i7-5557U CPU 3.10GHz (Broadwell)
Slower	0.65	+0	macOS Monterey 12.4	X64	Intel Core i5-4278U CPU 2.60GHz (Haswell)

tannergooding · 2022-08-11T20:59:54Z

Not by design; I would need to see the disassembly to see exactly what's being pessimized here.

This isn't important for .NET 7, however. Matrix4x4.CreateShadow is a case that isn't accelerated today and where the overall codegen is already suboptimal. The "proper" fix would be to rewrite the implementation to properly take advantage of the hardware intrinsics where possible.

ghost assigned tannergooding May 4, 2022

dotnet-issue-labeler bot added the area-CodeGen-coreclr label May 4, 2022

tannergooding closed this May 4, 2022

tannergooding reopened this May 5, 2022

tannergooding force-pushed the vector-cns branch 5 times, most recently from 4921110 to 360ff7e Compare May 9, 2022 14:40

Adding support for vector constants via GenTreeVecCon

d30818a

tannergooding force-pushed the vector-cns branch from 360ff7e to d30818a Compare May 10, 2022 13:57

tannergooding marked this pull request as ready for review May 10, 2022 20:19

SingleAccretion reviewed May 10, 2022

src/coreclr/jit/valuenum.cpp Outdated

SingleAccretion reviewed May 10, 2022

src/coreclr/jit/assertionprop.cpp Outdated

SingleAccretion reviewed May 10, 2022

SingleAccretion reviewed May 11, 2022

tannergooding mentioned this pull request May 20, 2022

Optimize System.Buffers for arm64 using cross-platform intrinsics #35033

Closed

2 tasks

tannergooding added 6 commits May 20, 2022 07:41

Merge remote-tracking branch 'dotnet/main' into vector-cns

2b82d9a

Responding to PR feedback

adbd04b

Merge remote-tracking branch 'dotnet/main' into vector-cns

55e44d6

Support tracking the underlying simdBaseJitType for GenTreeVecCon

db39727

Merge remote-tracking branch 'dotnet/main' into vector-cns

c4b1882

Applying formatting patch

5b7d3e4

SingleAccretion mentioned this pull request May 25, 2022

Eliminate the need to preserve struct handles for SIMD types #69822

Closed

tannergooding added 2 commits May 25, 2022 19:36

Merge remote-tracking branch 'dotnet/main' into vector-cns

afd0ca1

Merge remote-tracking branch 'dotnet/main' into vector-cns

52a3b70

Update src/coreclr/jit/gentree.cpp

ac4dc2b

Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>

SingleAccretion approved these changes May 31, 2022

TIHan reviewed May 31, 2022

src/coreclr/jit/assertionprop.cpp

TIHan approved these changes May 31, 2022

tannergooding mentioned this pull request May 31, 2022

Investigate removing gtSimdBaseJitType from GenTreeVecCon #70052

Closed

tannergooding merged commit 187bb1f into dotnet:main May 31, 2022

DrewScoggins mentioned this pull request Jun 2, 2022

[Perf] Changes at 5/28/2022 1:41:21 AM #70162

Closed

tannergooding mentioned this pull request Jun 2, 2022

Ensure that GT_CNS_VEC is handled in LinearScan::isMatchingConstant #70171

Merged

This was referenced Jun 5, 2022

Assertion failed 'varTypeIsSIMD(type)' #70260

Closed

Test failure: System.Numerics.Tests.Matrix3x2Tests.Matrix3x2CreateRotationCenterTest #70261

Closed

SingleAccretion mentioned this pull request Jun 5, 2022

Test failure System.Numerics.Tests.Matrix3x2Tests.Matrix3x2CreateRotationCenterTest #70124

Closed

This was referenced Jun 7, 2022

[Perf] Changes at 6/3/2022 1:48:25 AM dotnet/perf-autofiling-issues#5807

Closed

[Perf] Changes at 5/31/2022 9:21:33 PM dotnet/perf-autofiling-issues#5804

Closed

AndyAyersMS mentioned this pull request Jun 7, 2022

Regressions from vector constants change #70368

Closed

This was referenced Jun 7, 2022

[Perf] Changes at 6/2/2022 9:03:53 PM dotnet/perf-autofiling-issues#5883

Closed

[Perf] Changes at 5/31/2022 9:21:33 PM dotnet/perf-autofiling-issues#5830

Closed

AndyAyersMS mentioned this pull request Jun 7, 2022

[Perf] Changes at 5/31/2022 9:21:33 PM dotnet/perf-autofiling-issues#5872

Closed

JulieLeeMSFT added this to the 7.0.0 milestone Jun 8, 2022

This was referenced Jun 9, 2022

[Perf] Changes at 6/2/2022 10:27:31 PM #70499

Closed

Regressions in System.Text.Tests.Perf_Encoding and Perf_Encoders #70501

Closed

JulieLeeMSFT mentioned this pull request Jul 7, 2022

What's new in .NET 7 Preview 6 [WIP] dotnet/core#7454

Closed

ghost locked as resolved and limited conversation to collaborators Jul 9, 2022
