Remove some overhead from SymbolicRegexMatcher #105668

Merged: 1 commit, Jul 30, 2024

stephentoub
Member

Fixes #104975
cc: @ieviev, @veanes

  • Avoid try/finally blocks, instead using breaks to break out of a loop to a single return
  • Consolidate two almost identical methods for DFA transitions
  • Avoid an unnecessary GetPositionKind call that could instead use a hardcoded index
  • Avoid a duplicate read for a ref parameter
  • Outline a throw
Type                                   Toolchain          Pattern                                       Mean  Ratio
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  \s[a-zA-Z]{0,12}ing\s                  2,697.27 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    \s[a-zA-Z]{0,12}ing\s                  2,495.75 us  0.93
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  \w+\s+Holmes                           2,456.07 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    \w+\s+Holmes                           2,217.08 us  0.90
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  \w+\s+Holmes\s+\w+                     2,445.44 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    \w+\s+Holmes\s+\w+                     2,208.26 us  0.90
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  Sherlock|Holmes                          127.52 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    Sherlock|Holmes                          117.73 us  0.92
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  Sherlock|Holmes|Watson                   174.69 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    Sherlock|Holmes|Watson                   171.71 us  0.99
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  Sherlock|Holm(...)er|John|Baker [45]   1,640.95 us  1.01
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    Sherlock|Holm(...)er|John|Baker [45]   1,528.37 us  0.94
Perf_Regex_Industry_RustLang_Sherlock  \main\corerun.exe  Sherlock|Street                           52.88 us  1.00
Perf_Regex_Industry_RustLang_Sherlock  \pr\corerun.exe    Sherlock|Street                           52.34 us  0.99
Perf_Regex_Industry_Leipzig            \main\corerun.exe  .{0,2}(Tom|Sawyer|Huckleberry|Finn)   63,165.09 us  1.00
Perf_Regex_Industry_Leipzig            \pr\corerun.exe    .{0,2}(Tom|Sawyer|Huckleberry|Finn)   58,308.39 us  0.92
Perf_Regex_Industry_Leipzig            \main\corerun.exe  .{2,4}(Tom|Sawyer|Huckleberry|Finn)   62,901.15 us  1.00
Perf_Regex_Industry_Leipzig            \pr\corerun.exe    .{2,4}(Tom|Sawyer|Huckleberry|Finn)   54,967.23 us  0.87
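
For readers unfamiliar with the patterns named in the bullet list, here is a minimal sketch of three of them: replacing try/finally with a break to a single return, reading a ref parameter once, and outlining a throw. This is not the code from this PR; `ScanSketch`, `FindBefore`, `FindAfter`, and `ThrowNegativePosition` are hypothetical names invented for illustration, and the scan loop is an assumed stand-in for the matcher's inner loop.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical sketch; these names are illustrative and do not come from SymbolicRegexMatcher.
static class ScanSketch
{
    // Before: try/finally guarantees the write-back to 'pos', 'pos' is read twice,
    // and the throw is inline in the method body.
    public static bool FindBefore(ReadOnlySpan<char> input, char target, ref int pos)
    {
        if (pos < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(pos));
        }

        int i = pos;
        try
        {
            for (; i < input.Length; i++)
            {
                if (input[i] == target)
                {
                    return true;
                }
            }
            return false;
        }
        finally
        {
            pos = i;
        }
    }

    // After: 'pos' is read once into a local, the loop exits with break,
    // and a single return follows the write-back.
    public static bool FindAfter(ReadOnlySpan<char> input, char target, ref int pos)
    {
        int i = pos; // single read of the ref parameter
        if (i < 0)
        {
            ThrowNegativePosition();
        }

        bool found = false;
        for (; i < input.Length; i++)
        {
            if (input[i] == target)
            {
                found = true;
                break;
            }
        }

        pos = i;
        return found;
    }

    // The throw lives in its own method so the hot path stays small.
    [DoesNotReturn]
    private static void ThrowNegativePosition() =>
        throw new ArgumentOutOfRangeException("pos");
}
```

Both methods are intended to behave identically; for example, calling `ScanSketch.FindAfter("abc", 'c', ref pos)` with `pos` starting at 0 returns true and leaves `pos` at 2. Whether such a rewrite is measurably faster depends on the JIT and the workload, so any gain should be confirmed with benchmarks like the ones in the table above.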

Contributor

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

stephentoub changed the title from "Remove some overhead from SymbolcRegexMatcher" to "Remove some overhead from SymbolicRegexMatcher" on Jul 30, 2024
Contributor

veanes left a comment:

The changes look good; they avoid some small overheads that add up overall.

Successfully merging this pull request may close these issues.

[Perf] Linux/x64: 2 Regressions in Regex on 7/11/2024 6:16:28 PM