Avoid allocations for Unicode data tries #15074
Merged
What does the pull request do?
This PR completely avoids allocations for Unicode data tries on .NET 7+, removing 195 kB of always-allocated managed memory.
How was the solution implemented (if it's not obvious)?
The trie's data is now a property returning `ReadOnlySpan<uint>` instead of `ReadOnlySpan<byte>`, which is used directly instead of copying the bytes to a runtime `uint[]`.

Starting with .NET 7 and Roslyn 17.5, this syntax makes the compiler use `RuntimeHelpers.CreateSpan`, which references the data embedded inside the assembly directly on little-endian systems, without having to allocate or copy anything! (For .NET Standard, the compiler falls back to an array.)
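For readers unfamiliar with the pattern, here is a minimal sketch of the two shapes described above. It is not the PR's code: the class names, member names, and the tiny data set are hypothetical, and the real trie data is generated and much larger. The "before" shape assumes the bytes were reinterpreted with `MemoryMarshal.Cast` and copied, which is one plausible way to build the runtime `uint[]`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical "before" shape: raw bytes embedded in the assembly,
// then copied into a managed uint[] that stays allocated for the process lifetime.
internal static class TrieDataBefore
{
    // Placeholder bytes; each group of four encodes one little-endian uint.
    private static ReadOnlySpan<byte> RawData => new byte[]
    {
        0x01, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x00,
    };

    // The copy below is the allocation this approach always pays for.
    private static readonly uint[] s_data =
        MemoryMarshal.Cast<byte, uint>(RawData).ToArray();

    public static uint GetValue(int index) => s_data[index];
}

// Hypothetical "after" shape: the property returns ReadOnlySpan<uint> directly.
// Per the PR description, on .NET 7+ with a recent compiler this lowers to
// RuntimeHelpers.CreateSpan over the embedded data; on .NET Standard the
// compiler falls back to an array.
internal static class TrieDataAfter
{
    public static ReadOnlySpan<uint> Data => new uint[]
    {
        0x00000001,
        0x00000002,
    };

    // Lookups index the span directly; no intermediate uint[] field is needed.
    public static uint GetValue(int index) => Data[index];
}
```

Callers see the same lookup in both shapes, so the change stays local to how the data is declared and read.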
Performance
I expected better memory usage, but it turns out we also get some performance improvement. With these changes, the JIT is able to inline some constants from the Unicode data directly, generating better code.
Before:
After: