Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegExp syntax checking performance #58339

Merged
merged 8 commits into from
Apr 29, 2024
Merged

RegExp syntax checking performance #58339

merged 8 commits into from
Apr 29, 2024

Conversation

rbuckton
Copy link
Member

@rbuckton rbuckton commented Apr 27, 2024

This makes a few changes to the RegExp syntax checking algorithm for possible performance improvements. This also fixes a small typo in one of the RegExp diagnostic messages.

One of the changes is to ensure we perform proper bounds checks when pos might advance past end. To accomplish this, I added the charCodeChecked and charCodeUnchecked functions. charCodeChecked performs boundary tests which both improves reliability and avoids potential deoptimizations due to reading past the end of a string. charCodeUnchecked is simply a wrapper for text.charCodeAt and exists primarily to make it easier to readily identify unchecked vs checked reads from text. charCodeUnchecked should only be used when we've already performed a bounds check before reading a character, such as in a while (pos < end) loop. I've also done the same for code points with codePointChecked and codePointUnchecked. Each of these functions is small, so ideally will be inlined by an optimizing compiler like the one used by V8.

I've primarily focused on optimizing scanRegularExpressionWorker for now. If this seems effective, I will investigate adopting charCodeChecked in more places in the scanner in a later PR.

I've also made a few other small perf improvements:

  • We now delay allocations for capture group tracking until they are necessary.
  • scanRegularExpressionWorker has been moved out of reScanSlashToken and updated to use the existing text and end values from the outer scope.
  • A few frequently referenced variables from the RegExp scanner have been switched to var to avoid TDZ check overhead.

@typescript-bot typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Apr 27, 2024
@rbuckton
Copy link
Member Author

@typescript-bot perf test

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 27, 2024

Starting jobs; this comment will be updated as builds start and complete.

Command Status Results
perf test ✅ Started ❌ Results

@typescript-bot
Copy link
Collaborator

@rbuckton, the perf run you requested failed. You can check the log here.

@typescript-bot
Copy link
Collaborator

@rbuckton
The results of the perf run you requested are in!

Here they are:

tsc

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
Compiler-Unions - node (v18.15.0, x64)
Errors 30 30 ~ ~ ~ p=1.000 n=6
Symbols 62,154 62,154 ~ ~ ~ p=1.000 n=6
Types 50,273 50,273 ~ ~ ~ p=1.000 n=6
Memory used 192,825k (± 0.74%) 192,796k (± 0.75%) ~ 192,135k 195,762k p=0.521 n=6
Parse Time 2.02s (± 1.38%) 1.95s (± 0.46%) 🟩-0.07s (- 3.62%) 1.94s 1.96s p=0.005 n=6
Bind Time 1.07s (± 1.09%) 1.07s (± 0.96%) ~ 1.06s 1.08s p=0.437 n=6
Check Time 14.04s (± 0.08%) 14.08s (± 0.29%) +0.04s (+ 0.30%) 14.04s 14.13s p=0.040 n=6
Emit Time 3.87s (± 0.90%) 3.88s (± 0.44%) ~ 3.86s 3.90s p=0.627 n=6
Total Time 21.00s (± 0.09%) 20.98s (± 0.23%) ~ 20.91s 21.04s p=0.686 n=6
angular-1 - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 945,172 945,172 ~ ~ ~ p=1.000 n=6
Types 408,068 408,068 ~ ~ ~ p=1.000 n=6
Memory used 1,222,036k (± 0.00%) 1,222,022k (± 0.00%) ~ 1,221,973k 1,222,117k p=0.378 n=6
Parse Time 6.92s (± 0.50%) 6.78s (± 0.50%) -0.15s (- 2.14%) 6.75s 6.84s p=0.005 n=6
Bind Time 1.87s (± 0.71%) 1.87s (± 0.28%) ~ 1.87s 1.88s p=0.672 n=6
Check Time 31.41s (± 0.32%) 31.36s (± 0.17%) ~ 31.27s 31.42s p=0.295 n=6
Emit Time 14.57s (± 0.82%) 14.65s (± 0.75%) ~ 14.56s 14.86s p=0.809 n=6
Total Time 54.77s (± 0.42%) 54.66s (± 0.33%) ~ 54.52s 55.00s p=0.297 n=6
mui-docs - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 1,954,635 1,954,635 ~ ~ ~ p=1.000 n=6
Types 676,418 676,418 ~ ~ ~ p=1.000 n=6
Memory used 1,753,442k (± 0.00%) 1,753,451k (± 0.00%) ~ 1,753,422k 1,753,484k p=0.630 n=6
Parse Time 6.89s (± 0.31%) 6.73s (± 0.35%) -0.17s (- 2.42%) 6.70s 6.76s p=0.005 n=6
Bind Time 2.31s (± 0.18%) 2.31s (± 0.33%) ~ 2.30s 2.32s p=1.000 n=6
Check Time 56.81s (± 0.11%) 56.85s (± 0.45%) ~ 56.63s 57.29s p=0.470 n=6
Emit Time 0.14s (± 2.88%) 0.14s ~ ~ ~ p=0.405 n=6
Total Time 66.16s (± 0.13%) 66.03s (± 0.38%) ~ 65.82s 66.43s p=0.297 n=6
self-build-src - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,606 +39 (+ 0.00%) ~ ~ p=0.001 n=6
Types 257,612 257,638 +26 (+ 0.01%) ~ ~ p=0.001 n=6
Memory used 2,323,147k (± 0.03%) 2,323,519k (± 0.02%) ~ 2,323,071k 2,324,130k p=0.471 n=6
Parse Time 7.56s (± 0.61%) 7.44s (± 1.06%) -0.12s (- 1.59%) 7.34s 7.57s p=0.016 n=6
Bind Time 2.75s (± 0.48%) 2.75s (± 0.85%) ~ 2.73s 2.78s p=0.936 n=6
Check Time 49.78s (± 0.59%) 49.66s (± 0.78%) ~ 49.29s 50.38s p=0.471 n=6
Emit Time 3.91s (± 1.63%) 3.95s (± 3.43%) ~ 3.85s 4.22s p=0.810 n=6
Total Time 64.01s (± 0.48%) 63.82s (± 0.61%) ~ 63.55s 64.57s p=0.173 n=6
self-build-src-public-api - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,606 +39 (+ 0.00%) ~ ~ p=0.001 n=6
Types 257,612 257,638 +26 (+ 0.01%) ~ ~ p=0.001 n=6
Memory used 2,396,896k (± 0.01%) 2,397,995k (± 0.02%) +1,099k (+ 0.05%) 2,397,126k 2,398,511k p=0.008 n=6
Parse Time 5.25s (± 1.17%) 5.15s (± 0.75%) -0.09s (- 1.78%) 5.10s 5.20s p=0.016 n=6
Bind Time 1.70s (± 1.33%) 1.69s (± 1.71%) ~ 1.64s 1.73s p=0.870 n=6
Check Time 34.43s (± 0.21%) 34.41s (± 0.17%) ~ 34.34s 34.47s p=0.471 n=6
Emit Time 2.63s (± 0.83%) 2.65s (± 1.54%) ~ 2.60s 2.70s p=0.575 n=6
Total Time 44.02s (± 0.23%) 43.91s (± 0.22%) ~ 43.81s 44.04s p=0.199 n=6
self-compiler - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 256,196 256,224 +28 (+ 0.01%) ~ ~ p=0.001 n=6
Types 103,640 103,656 +16 (+ 0.02%) ~ ~ p=0.001 n=6
Memory used 424,190k (± 0.01%) 424,200k (± 0.01%) ~ 424,167k 424,242k p=0.471 n=6
Parse Time 3.49s (± 1.55%) 3.33s (± 0.35%) 🟩-0.16s (- 4.72%) 3.31s 3.34s p=0.005 n=6
Bind Time 1.30s (± 0.90%) 1.30s (± 0.84%) ~ 1.29s 1.32s p=0.865 n=6
Check Time 18.21s (± 0.37%) 18.18s (± 0.39%) ~ 18.09s 18.27s p=0.687 n=6
Emit Time 1.36s (± 1.28%) 1.38s (± 0.89%) ~ 1.36s 1.39s p=0.241 n=6
Total Time 24.37s (± 0.31%) 24.18s (± 0.32%) -0.18s (- 0.76%) 24.06s 24.25s p=0.005 n=6
ts-pre-modules - node (v18.15.0, x64)
Errors 35 35 ~ ~ ~ p=1.000 n=6
Symbols 224,824 224,824 ~ ~ ~ p=1.000 n=6
Types 93,390 93,390 ~ ~ ~ p=1.000 n=6
Memory used 369,410k (± 0.05%) 369,296k (± 0.00%) -114k (- 0.03%) 369,279k 369,315k p=0.037 n=6
Parse Time 3.67s (± 0.38%) 3.53s (± 1.05%) 🟩-0.14s (- 3.91%) 3.49s 3.59s p=0.005 n=6
Bind Time 1.95s (± 1.41%) 1.94s (± 0.78%) ~ 1.92s 1.96s p=0.413 n=6
Check Time 19.48s (± 0.41%) 19.44s (± 0.24%) ~ 19.36s 19.49s p=0.195 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 25.10s (± 0.30%) 24.90s (± 0.21%) -0.20s (- 0.80%) 24.83s 24.97s p=0.005 n=6
vscode - node (v18.15.0, x64)
Errors 4 4 ~ ~ ~ p=1.000 n=6
Symbols 2,797,149 2,797,149 ~ ~ ~ p=1.000 n=6
Types 950,053 950,053 ~ ~ ~ p=1.000 n=6
Memory used 2,925,215k (± 0.00%) 2,925,305k (± 0.00%) ~ 2,925,155k 2,925,515k p=0.575 n=6
Parse Time 13.56s (± 0.28%) 13.32s (± 0.30%) -0.24s (- 1.79%) 13.26s 13.38s p=0.005 n=6
Bind Time 4.12s (± 2.10%) 4.09s (± 0.25%) ~ 4.07s 4.10s p=0.615 n=6
Check Time 73.08s (± 0.39%) 73.11s (± 0.48%) ~ 72.52s 73.49s p=0.689 n=6
Emit Time 21.41s (± 9.45%) 21.03s (± 8.39%) ~ 19.61s 23.52s p=0.810 n=6
Total Time 112.17s (± 1.88%) 111.54s (± 1.59%) ~ 110.03s 114.21s p=0.575 n=6
webpack - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 265,853 265,853 ~ ~ ~ p=1.000 n=6
Types 108,438 108,438 ~ ~ ~ p=1.000 n=6
Memory used 410,385k (± 0.02%) 410,331k (± 0.01%) ~ 410,276k 410,388k p=0.109 n=6
Parse Time 3.95s (± 0.56%) 3.84s (± 0.61%) -0.11s (- 2.74%) 3.80s 3.87s p=0.005 n=6
Bind Time 1.69s (± 0.79%) 1.67s (± 0.63%) -0.02s (- 1.38%) 1.65s 1.68s p=0.017 n=6
Check Time 17.05s (± 0.41%) 17.06s (± 0.30%) ~ 16.97s 17.10s p=0.936 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 22.69s (± 0.36%) 22.57s (± 0.33%) -0.12s (- 0.54%) 22.43s 22.64s p=0.045 n=6
xstate-main - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 523,981 523,981 ~ ~ ~ p=1.000 n=6
Types 178,708 178,708 ~ ~ ~ p=1.000 n=6
Memory used 461,203k (± 0.01%) 461,166k (± 0.01%) ~ 461,120k 461,206k p=0.298 n=6
Parse Time 2.69s (± 0.61%) 2.61s (± 0.56%) -0.08s (- 2.92%) 2.59s 2.63s p=0.005 n=6
Bind Time 0.98s (± 0.64%) 0.99s (± 0.82%) +0.01s (+ 1.36%) 0.98s 1.00s p=0.023 n=6
Check Time 15.47s (± 0.36%) 15.31s (± 0.26%) -0.16s (- 1.02%) 15.26s 15.36s p=0.005 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 19.14s (± 0.24%) 18.91s (± 0.19%) -0.23s (- 1.18%) 18.87s 18.97s p=0.005 n=6
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • Compiler-Unions - node (v18.15.0, x64)
  • angular-1 - node (v18.15.0, x64)
  • mui-docs - node (v18.15.0, x64)
  • self-build-src - node (v18.15.0, x64)
  • self-build-src-public-api - node (v18.15.0, x64)
  • self-compiler - node (v18.15.0, x64)
  • ts-pre-modules - node (v18.15.0, x64)
  • vscode - node (v18.15.0, x64)
  • webpack - node (v18.15.0, x64)
  • xstate-main - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

tsserver

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
Compiler-UnionsTSServer - node (v18.15.0, x64)
Req 1 - updateOpen 3,504ms (± 0.33%) 3,421ms (± 0.43%) -83ms (- 2.36%) 3,399ms 3,436ms p=0.005 n=6
Req 2 - geterr 7,555ms (± 0.59%) 7,546ms (± 0.55%) ~ 7,478ms 7,603ms p=0.575 n=6
Req 3 - references 439ms (± 1.96%) 428ms (± 1.03%) -12ms (- 2.62%) 423ms 436ms p=0.016 n=6
Req 4 - navto 336ms (± 0.36%) 337ms (± 0.75%) ~ 334ms 341ms p=0.803 n=6
Req 5 - completionInfo count 1,357 1,357 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 112ms (± 1.04%) 116ms (± 8.29%) ~ 111ms 135ms p=0.867 n=6
CompilerTSServer - node (v18.15.0, x64)
Req 1 - updateOpen 3,701ms (± 1.21%) 3,581ms (± 0.38%) 🟩-120ms (- 3.25%) 3,556ms 3,593ms p=0.005 n=6
Req 2 - geterr 5,702ms (± 0.16%) 5,685ms (± 0.40%) ~ 5,652ms 5,715ms p=0.128 n=6
Req 3 - references 445ms (± 0.46%) 443ms (± 0.46%) ~ 441ms 447ms p=0.145 n=6
Req 4 - navto 341ms (± 1.35%) 340ms (± 1.31%) ~ 333ms 346ms p=1.000 n=6
Req 5 - completionInfo count 1,519 1,519 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 111ms (± 6.02%) 109ms (± 0.90%) ~ 107ms 110ms p=0.929 n=6
xstate-main-1-tsserver - node (v18.15.0, x64)
Req 1 - updateOpen 6,228ms (± 0.67%) 7,346ms (± 8.35%) ~ 6,095ms 7,632ms p=0.066 n=6
Req 2 - geterr 1,633ms (± 8.49%) 1,627ms (± 8.59%) ~ 1,342ms 1,695ms p=0.810 n=6
Req 3 - references 127ms (± 3.66%) 123ms (± 9.38%) ~ 100ms 133ms p=0.934 n=6
Req 4 - navto 588ms (± 2.73%) 584ms (± 2.54%) ~ 573ms 609ms p=0.871 n=6
Req 5 - completionInfo count 3,413 3,413 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 1,273ms (± 1.81%) 1,261ms (± 1.97%) ~ 1,228ms 1,304ms p=0.423 n=6
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • CompilerTSServer - node (v18.15.0, x64)
  • Compiler-UnionsTSServer - node (v18.15.0, x64)
  • xstate-main-1-tsserver - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

startup

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
tsc-startup - node (v18.15.0, x64)
Execution time 227.45ms (± 0.18%) 227.31ms (± 0.17%) -0.14ms (- 0.06%) 225.53ms 231.35ms p=0.001 n=600
tsserver-startup - node (v18.15.0, x64)
Execution time 359.11ms (± 0.27%) 359.09ms (± 0.28%) ~ 350.85ms 367.65ms p=0.655 n=600
tsserverlibrary-startup - node (v18.15.0, x64)
Execution time 287.26ms (± 0.28%) 287.27ms (± 0.32%) ~ 280.38ms 306.23ms p=0.291 n=600
typescript-startup - node (v18.15.0, x64)
Execution time 286.91ms (± 0.30%) 286.91ms (± 0.28%) ~ 280.02ms 293.59ms p=0.230 n=600
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • tsc-startup - node (v18.15.0, x64)
  • tsserver-startup - node (v18.15.0, x64)
  • tsserverlibrary-startup - node (v18.15.0, x64)
  • typescript-startup - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

Developer Information:

Download Benchmarks

@rbuckton rbuckton marked this pull request as ready for review April 27, 2024 01:48
@rbuckton
Copy link
Member Author

@typescript-bot run dt
@typescript-bot test top400
@typescript-bot test tsserver top100
@typescript-bot user test this
@typescript-bot user test tsserver

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 27, 2024

Starting jobs; this comment will be updated as builds start and complete.

Command Status Results
run dt ✅ Started ✅ Results
test top400 ✅ Started ✅ Results
test tsserver top100 ✅ Started
user test this ✅ Started ✅ Results
user test tsserver ✅ Started ✅ Results

@typescript-bot
Copy link
Collaborator

@rbuckton Here are the results of running the user tests comparing main and refs/pull/58339/merge:

Everything looks good!

@typescript-bot
Copy link
Collaborator

Hey @rbuckton, the results of running the DT tests are ready.

Everything looks the same!

You can check the log here.

@typescript-bot
Copy link
Collaborator

@rbuckton Here are the results of running the user tests comparing main and refs/pull/58339/merge:

Everything looks good!

src/compiler/scanner.ts Outdated Show resolved Hide resolved
@typescript-bot
Copy link
Collaborator

@rbuckton Here are the results of running the top 400 repos comparing main and refs/pull/58339/merge:

Everything looks good!

@jakebailey
Copy link
Member

The perf seels really good. Is that all from avoiding out of bounds?

@rbuckton
Copy link
Member Author

The perf seels really good. Is that all from avoiding out of bounds?

Yeah, if charCodeAt is called for an out of bounds index it returns NaN, which isn't an SMI. IIRC, accessing a string out of bounds also results in sub optional compilation as it requires a recompile to introduce additional checks in the compiled code.

Co-authored-by: Jake Bailey <5341706+jakebailey@users.noreply.github.com>
@DanielRosenwasser
Copy link
Member

Might be worth mentioning in the PR description that you also var-ified the regular expression parser (#52924).

@DanielRosenwasser
Copy link
Member

Also other specific optimizations - like hoisting out helper functions and lazily initializing state for captures.

Copy link
Member

@jakebailey jakebailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, forgot my green check.

@rbuckton
Copy link
Member Author

I've been working on expanding this throughout scanner.ts and writing up a lint rule to ensure enforcement, so I'll probably need another review shortly.

@rbuckton
Copy link
Member Author

I've been working on expanding this throughout scanner.ts and writing up a lint rule to ensure enforcement, so I'll probably need another review shortly.

Since I have some additional testing to do, I think I'll put those changes up as a separate PR following this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants