
chore(parser): optimize the most common lexer matches into lookup tables #814

Merged
4 commits merged into apollographql:main on Jan 31, 2024

Conversation

allancalix (Contributor)

Overview

Basically this just bastardizes the good ideas in https://github.com/maciejhirsz/logos into a modest change that tries to get the most value without making the code unreadable or hard to maintain. Both functions internally call is_ascii before indexing into the tables, since we're dealing with UTF-8 and a non-ASCII character would otherwise index past the 256 entries.
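
A minimal sketch of the idea (the table and function names here are illustrative, not the actual apollo-rs internals): the hot-path character classes become 256-entry tables indexed by byte value, with an is_ascii guard so that multi-byte UTF-8 code points never index past the table.

const NAME_CONTINUE: [bool; 256] = {
    let mut lut = [false; 256];
    // Mark _, a-z, A-Z, and 0-9 as valid name-continuation characters.
    lut[b'_' as usize] = true;
    let mut i = b'a';
    while i <= b'z' {
        lut[i as usize] = true;
        i += 1;
    }
    let mut i = b'A';
    while i <= b'Z' {
        lut[i as usize] = true;
        i += 1;
    }
    let mut i = b'0';
    while i <= b'9' {
        lut[i as usize] = true;
        i += 1;
    }
    lut
};

// The is_ascii check keeps `c as usize` within 0..=127, safely inside
// the table's bounds; the match arms collapse into one indexed load.
#[inline]
fn is_name_continue(c: char) -> bool {
    c.is_ascii() && NAME_CONTINUE[c as usize]
}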

There's obviously more you could do, but from my profiles these seem to be the most impactful paths to optimize without making the whole thing unreadable.

Feel free to close, as there are obvious tradeoffs here, but I opened this for your consideration.

Impact

The impact of this change really depends on the document being parsed. For small documents it's negligible, but for large documents, particularly schema SDLs, the change speeds things up by up to ~18%.

test result: ok. 0 passed; 0 failed; 65 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/query.rs (/Users/allancalix/src/github.com/allancalix/apollo-rs/target/release/deps/query-008ff12600681735)
Benchmarking many_aliases: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.8s, enable flat sampling, or reduce sample count to 50.
many_aliases            time:   [1.9430 ms 1.9473 ms 1.9521 ms]
                        change: [-2.1904% -1.8069% -1.4314%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

query_lexer             time:   [1.1066 µs 1.1090 µs 1.1115 µs]
                        change: [-6.5579% -6.1115% -5.7380%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

query_parser            time:   [7.4054 µs 7.4149 µs 7.4241 µs]
                        change: [-13.829% -10.061% -6.8751%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

     Running benches/supergraph.rs (/Users/allancalix/src/github.com/allancalix/apollo-rs/target/release/deps/supergraph-4c8cf367fca5ad96)
supergraph_lexer        time:   [47.780 µs 47.864 µs 47.956 µs]
                        change: [-5.9375% -5.6993% -5.4348%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

supergraph_parser       time:   [201.49 µs 201.96 µs 202.47 µs]
                        change: [-1.8449% -1.4664% -1.0951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

The delta on the public GitHub schema was quite sizable:

apollo - parse github schema
                        time:   [4.8513 ms 4.8560 ms 4.8611 ms]
                        change: [-18.862% -18.703% -18.548%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

@@ -17,7 +17,7 @@ impl<'a> Token<'a> {
     }

     /// Get a reference to the token's data.
-    pub fn data(&self) -> &str {
+    pub fn data(&self) -> &'a str {
allancalix (Contributor, Author)

I forgot about this; just a note that this lifetime was always bound to the source document and not the token itself - it's fine to hold the data as long as the source is still around.

I'm not sure if this is a breaking change, since it's technically more permissive than it was before, but just FYI.
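
As a hedged illustration of why the 'a-bound return type is more permissive (this Token is a simplified stand-in, and first_token_data is a hypothetical caller, not real apollo-rs code):

struct Token<'a> {
    data: &'a str,
}

impl<'a> Token<'a> {
    /// Bound to the source text's lifetime 'a, not to the &self borrow.
    pub fn data(&self) -> &'a str {
        self.data
    }
}

fn first_token_data<'a>(source: &'a str) -> &'a str {
    let token = Token {
        data: source.split_whitespace().next().unwrap_or(""),
    };
    // With the old `-> &str` signature (elided to the &self lifetime),
    // this return would not compile because the result could not
    // outlive `token`; with `-> &'a str` it borrows from `source`.
    token.data()
}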

@goto-bus-stop (Member), Jan 29, 2024

Good spot! If we thread this change through other parts of the code I think we can actually get rid of some intermediate allocations in the parser. I'll submit that in a follow-up PR.

@goto-bus-stop (Member) left a comment

I like the idea. I do kinda doubt the lexer is the most important part of apollo-rs to optimise (and currently big changes to improve performance are not our primary priority anyways), but a simple incremental improvement like this is always nice!

lut[b'w' as usize] = true;
lut[b'x' as usize] = true;
lut[b'y' as usize] = true;
lut[b'z' as usize] = true;
Member

If only ranged for were available in const fn 😁

@Geal (Contributor), Jan 29, 2024

I think this could be a while loop.

allancalix (Contributor, Author)

Something like this works:

const fn generate_table() -> [bool; 256] {
    let mut table = [false; 256];

    // Mark a-z.
    let mut i = b'a';
    while i <= b'z' {
        table[i as usize] = true;
        i += 1;
    }

    // Mark A-Z.
    let mut i = b'A';
    while i <= b'Z' {
        table[i as usize] = true;
        i += 1;
    }

    table
}

It's less obvious what's happening, though, and a mistake can make the compiler hang. A const for loop would be perfect, but that requires unstable features. Also, we don't want to make it too easy to apply this to the whole lexer: other token types are a lot less frequent, and benchmarking/profiling showed no benefit in those cases.

In fact, in some cases (I forget which) it seemed to actually make things worse.

Member

I agree writing this stuff out is simpler to understand than a while loop.

@goto-bus-stop changed the title from "Optimizes the most common lexer matches into lookup tables" to "chore(parser): optimize the most common lexer matches into lookup tables" on Jan 29, 2024
@goto-bus-stop merged commit 12db7e8 into apollographql:main on Jan 31, 2024
12 checks passed