Parse paths and identifiers only once #840

Kijewski · 2023-07-14T10:54:26Z

In the old implementation each variable in an expression would be parsed up to three times:

Try to parse a path because it contains a leading double colon, or infix double colons.
Try to parse it as path again by scanning for an identifier that contains an upper case character.
Fall back to scanning for any identifier.

This PR turns all three steps into one, without the need for backtracking.

djc

Nice! Did you benchmark the effect?

djc · 2023-07-21T13:41:55Z

askama_derive/src/parser/mod.rs

-
-            // If `identifier()` fails then just return the original error
-            Err(err)
+        (None, name, []) if !name.contains(char::is_uppercase) => {


What tests break if we leave out this arm? Maybe move the comment to be directly before this arm, which seems to be the main thing that's affected?

This arm tells variable names (no upper case characters) and struct names (at least one upper case character) apart. I guess every test will fail without it. :)

askama_derive/src/parser/expr.rs

Kijewski · 2023-07-21T14:33:43Z

As benchmark I used cd testing; find -name '*.rs' -exec touch '{}' '+'; time cargo b --tests (best out of three runs):

       real     user    sys
old  1.623s  10.442s  2.235s
new  1.473s  10.483s  2.272s

I cannot say why the real time got so much better, if the user time got worse? All CPU cores only have around 33°C, so they should all be utilized in my test without thermal throttling. I guess I should try to find a better benchmark, but I guess without #834 we cannot properly benchmark the parser. :-/ I'll try to review your PR this weekend!

djc · 2023-08-02T09:09:35Z

(Feel free to just merge this after rebasing.)

In the old implementation each variable in an expression would be parsed up to three times: * Try to parse a path because it contains a leading double colon, or infix double colons. * Try to parse it as path again by scanning for an identifier that contains an upper case character. * Fall back to scanning for any identifier. This PR turns all three steps into one, without the need for backtracking.

Kijewski · 2023-08-02T22:00:55Z

Not a huge difference, but it's a little better:

100 bytes               time:   [11.475 µs 11.594 µs 11.694 µs]
                        change: [-6.8066% -4.5471% -2.2145%] (p = 0.00 < 0.05)
                        Performance has improved.

1000 bytes              time:   [62.269 µs 62.421 µs 62.555 µs]
                        change: [-4.8183% -4.1854% -3.5724%] (p = 0.00 < 0.05)
                        Performance has improved.

10000 bytes             time:   [536.05 µs 536.83 µs 537.69 µs]
                        change: [-3.2005% -2.9935% -2.7835%] (p = 0.00 < 0.05)
                        Performance has improved.

100000 bytes            time:   [4.9944 ms 4.9996 ms 5.0053 ms]
                        change: [-2.8939% -2.7114% -2.5330%] (p = 0.00 < 0.05)
                        Performance has improved.

djc approved these changes Jul 21, 2023

View reviewed changes

Kijewski and others added 2 commits August 2, 2023 23:49

Optimize parsing boolean literals, too

1c5f7dd

Kijewski merged commit d4fbad1 into djc:main Aug 2, 2023
18 checks passed

Kijewski deleted the pr-parse-var-once branch August 2, 2023 22:02

djc mentioned this pull request Jan 18, 2024

Fix support for mixed case variables #951

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parse paths and identifiers only once #840

Parse paths and identifiers only once #840

Kijewski commented Jul 14, 2023

djc left a comment

djc Jul 21, 2023

Kijewski Jul 21, 2023

Kijewski commented Jul 21, 2023 •

edited

Loading

djc commented Aug 2, 2023

Kijewski commented Aug 2, 2023

Parse paths and identifiers only once #840

Parse paths and identifiers only once #840

Conversation

Kijewski commented Jul 14, 2023

djc left a comment

Choose a reason for hiding this comment

djc Jul 21, 2023

Choose a reason for hiding this comment

Kijewski Jul 21, 2023

Choose a reason for hiding this comment

Kijewski commented Jul 21, 2023 • edited Loading

djc commented Aug 2, 2023

Kijewski commented Aug 2, 2023

Kijewski commented Jul 21, 2023 •

edited

Loading