Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure maxSIMDStructBytes doesn't report compVerifyInstructionSetUnusable #85370

Merged
merged 1 commit into from
May 6, 2023

Conversation

tannergooding
Copy link
Member

This resolves #84923

Previously a couple paths were calling compOpportunisticallyDependsOn and then calling compVerifyInstructionSetUnusable if it failed. This is logically the same as just calling compExactlyDependsOn.

For getSIMDVectorRegisterByteLength this is expected as we have a distinct ABI and functionality difference for Vector<T> based on whether or not AVX2 is supported and so I've switched to just calling compExactlyDependsOn directly.

For maxSIMDStructBytes this was "unexpected" as we are only doing an opportunistic check for a few internal locations. This for example is used as part of local struct promotion and handling larger amounts of memory as part of block copy, memory comparison, or zeroing. As such, I've ensured it only calls compOpportunisticallyDependsOn.

For the latter, it had a few primary users:

  • roundUpSIMDSize(size) - used as part of genCodeForMemmove(tree)
  • roundDownSIMDSize(size) - used as part of BlkOpKindUnrollMemmove, impExpandHalfConstEqualsSIMD(...), and genCodeForMemmove(tree)
  • getUnrollThreshold(type, canUseSimd) - used in various places aspart of Memmove, Memset, and Memcpy
  • structSizeMightRepresentSIMDType(structSize) - used in various places to filter whether something could be a SIMD type or not (previously just guessed by using the largest possible vector size, regardless of support)
  • CanPromoteStructType(typeHnd)

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 26, 2023
@ghost ghost assigned tannergooding Apr 26, 2023
@ghost
Copy link

ghost commented Apr 26, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This resolves #84923

Previously a couple paths were calling compOpportunisticallyDependsOn and then calling compVerifyInstructionSetUnusable if it failed. This is logically the same as just calling compExactlyDependsOn.

For getSIMDVectorRegisterByteLength this is expected as we have a distinct ABI and functionality difference for Vector<T> based on whether or not AVX2 is supported and so I've switched to just calling compExactlyDependsOn directly.

For maxSIMDStructBytes this was "unexpected" as we are only doing an opportunistic check for a few internal locations. This for example is used as part of local struct promotion and handling larger amounts of memory as part of block copy, memory comparison, or zeroing. As such, I've ensured it only calls compOpportunisticallyDependsOn.

For the latter, it had a few primary users:

  • roundUpSIMDSize(size) - used as part of genCodeForMemmove(tree)
  • roundDownSIMDSize(size) - used as part of BlkOpKindUnrollMemmove, impExpandHalfConstEqualsSIMD(...), and genCodeForMemmove(tree)
  • getUnrollThreshold(type, canUseSimd) - used in various places aspart of Memmove, Memset, and Memcpy
  • structSizeMightRepresentSIMDType(structSize) - used in various places to filter whether something could be a SIMD type or not (previously just guessed by using the largest possible vector size, regardless of support)
  • CanPromoteStructType(typeHnd)
Author: tannergooding
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@tannergooding
Copy link
Member Author

For maxSIMDStructBytes this was "unexpected"

Worth noting we didn't always do this for maxSIMDStructBytes either, this was more recent and looks to have crept in as part of other refactorings done as part of AVX512 lightup.

@kunalspathak
Copy link
Member

/benchmark json aspnet-citrine-lin runtime

@pr-benchmarks
Copy link

pr-benchmarks bot commented Apr 26, 2023

Benchmark started for json on aspnet-citrine-lin with runtime. Logs: link

@tannergooding
Copy link
Member Author

application
CPU Usage (%) 99
Cores usage (%) 2,778
Working Set (MB) 98
Private Memory (MB) 622
Build Time (ms) 8,749
Start Time (ms) 227
Published Size (KB) 95,944
Symbols Size (KB) 54
.NET Core SDK Version 8.0.100-preview.5.23226.2
ASP.NET Core Version 8.0.0-preview.5.23226.2+cc54ac3
.NET Runtime Version 8.0.0-preview.4.23226.3+e5798eb
load
CPU Usage (%) 80
Cores usage (%) 2,230
Working Set (MB) 48
Private Memory (MB) 363
Start Time (ms) 0
First Request (ms) 163
Requests/sec 1,226,263

@kunalspathak
Copy link
Member

Seems we don't have linux agent configured correctly and hence doing cmd.exe /c ./build.sh which fails. For now, let me trigger windows run.

@kunalspathak
Copy link
Member

/benchmark json aspnet-citrine-win runtime

@pr-benchmarks
Copy link

pr-benchmarks bot commented Apr 26, 2023

Benchmark started for json on aspnet-citrine-win with runtime. Logs: link

@kunalspathak
Copy link
Member

/benchmark json aspnet-citrine-lin runtime

@pr-benchmarks
Copy link

pr-benchmarks bot commented Apr 26, 2023

Benchmark started for json on aspnet-citrine-lin with runtime. Logs: link

@kunalspathak
Copy link
Member

It again failed to log the results for windows, but here is the comparison:

2023-04-26T18:18:11.6866980Z | application           | json.base                 | json.pr                   |         |
2023-04-26T18:18:11.6870299Z | --------------------- | ------------------------- | ------------------------- | ------- |
2023-04-26T18:18:11.6870685Z | CPU Usage (%)         |                        77 |                        80 |  +3.90% |
2023-04-26T18:18:11.6871009Z | Cores usage (%)       |                     2,153 |                     2,244 |  +4.23% |
2023-04-26T18:18:11.6871319Z | Working Set (MB)      |                        80 |                        80 |   0.00% |
2023-04-26T18:18:11.6871640Z | Private Memory (MB)   |                       102 |                       102 |   0.00% |
2023-04-26T18:18:11.6875482Z | Build Time (ms)       |                     7,573 |                     9,214 | +21.67% |
2023-04-26T18:18:11.6876001Z | Start Time (ms)       |                       362 |                       361 |  -0.28% |
2023-04-26T18:18:11.6876332Z | Published Size (KB)   |                    98,207 |                    98,207 |   0.00% |
2023-04-26T18:18:11.6876654Z | Symbols Size (KB)     |                        54 |                        54 |   0.00% |
2023-04-26T18:18:11.6877134Z | .NET Core SDK Version | 8.0.100-preview.5.23226.2 | 8.0.100-preview.5.23226.2 |         |
2023-04-26T18:18:11.6886737Z 
2023-04-26T18:18:11.6893172Z 
2023-04-26T18:18:11.6908783Z | load                   | json.base  | json.pr    |         |
2023-04-26T18:18:11.6924097Z | ---------------------- | ---------- | ---------- | ------- |
2023-04-26T18:18:11.6930263Z | CPU Usage (%)          |         74 |         74 |   0.00% |
2023-04-26T18:18:11.6973236Z | Cores usage (%)        |      2,086 |      2,079 |  -0.34% |
2023-04-26T18:18:11.7103106Z | Working Set (MB)       |         48 |         48 |   0.00% |
2023-04-26T18:18:11.7147126Z | Private Memory (MB)    |        363 |        363 |   0.00% |
2023-04-26T18:18:11.7147987Z | Start Time (ms)        |          0 |          0 |         |
2023-04-26T18:18:11.7148332Z | First Request (ms)     |        150 |        153 |  +2.00% |
2023-04-26T18:18:11.7148844Z | Requests/sec           |  1,106,645 |  1,071,702 |  -3.16% |
2023-04-26T18:18:11.7149285Z | Requests               | 16,709,976 | 16,182,414 |  -3.16% |
2023-04-26T18:18:11.7151307Z | Mean latency (ms)      |       1.05 |       0.98 |  -6.67% |
2023-04-26T18:18:11.7151831Z | Max latency (ms)       |      53.66 |      44.31 | -17.42% |
2023-04-26T18:18:11.7152164Z | Bad responses          |          0 |          0 |         |
2023-04-26T18:18:11.7152458Z | Socket errors          |          0 |          0 |         |
2023-04-26T18:18:11.7153780Z | Read throughput (MB/s) |     154.09 |     149.22 |  -3.16% |
2023-04-26T18:18:11.7165085Z | Latency 50th (ms)      |       0.30 |       0.31 |  +2.95% |
2023-04-26T18:18:11.7201727Z | Latency 75th (ms)      |       0.80 |       0.81 |  +0.62% |
2023-04-26T18:18:11.7204554Z | Latency 90th (ms)      |       2.84 |       2.57 |  -9.51% |
2023-04-26T18:18:11.7205842Z | Latency 99th (ms)      |       9.57 |       8.89 |  -7.11% |

@sebastienros
Copy link
Member

/benchmark json aspnet-citrine-lin runtime

@pr-benchmarks
Copy link

pr-benchmarks bot commented Apr 27, 2023

Benchmark started for json on aspnet-citrine-lin with runtime. Logs: link

@sebastienros
Copy link
Member

/benchmark json aspnet-citrine-lin runtime

@pr-benchmarks
Copy link

pr-benchmarks bot commented Apr 27, 2023

Benchmark started for json on aspnet-citrine-lin with runtime. Logs: link

@tannergooding
Copy link
Member Author

Ping @dotnet/jit-contrib, @davidwrighton for review/feedback

@tannergooding tannergooding merged commit 7bd4666 into dotnet:main May 6, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TE startup/time-to-first-request regression
4 participants