change occurrences of std::regex uses to RE2 library calls #4100

Merged 3 commits on Aug 19, 2024

@mxwli commented Aug 16, 2024

Fixes #2933 by replacing uses of std::regex with RE2 library calls. The change also dramatically speeds up query compilation.

Contributor agreement

Benchmark Result

Master commit hash: 700b10f71c4e9ea19b4cd7e57ec864a38033225e
Branch commit hash: d99dcb2fcfafca5ee97741220a77d529b20276ac

| Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff (ms, %) |
|---|---|---|---|---|
| aggregation | q24 | 686.32 | 668.53 | 17.79 (2.66%) |
| aggregation | q28 | 10911.91 | 11953.42 | -1041.51 (-8.71%) |
| filter | q14 | 159.53 | 144.67 | 14.86 (10.28%) |
| filter | q15 | 165.97 | 147.65 | 18.32 (12.41%) |
| filter | q16 | 335.93 | 321.50 | 14.43 (4.49%) |
| filter | q17 | 483.78 | 465.22 | 18.56 (3.99%) |
| filter | q18 | 1928.20 | 1993.28 | -65.08 (-3.26%) |
| fixed_size_expr_evaluator | q07 | 577.59 | 555.62 | 21.97 (3.95%) |
| fixed_size_expr_evaluator | q08 | 784.23 | 777.35 | 6.88 (0.89%) |
| fixed_size_expr_evaluator | q09 | 787.43 | 778.60 | 8.83 (1.13%) |
| fixed_size_expr_evaluator | q10 | 274.45 | 257.71 | 16.74 (6.49%) |
| fixed_size_expr_evaluator | q11 | 269.13 | 251.64 | 17.49 (6.95%) |
| fixed_size_expr_evaluator | q12 | 267.48 | 250.60 | 16.87 (6.73%) |
| fixed_size_expr_evaluator | q13 | 1495.85 | 1484.57 | 11.28 (0.76%) |
| fixed_size_seq_scan | q23 | 156.29 | 133.27 | 23.02 (17.27%) |
| join | q31 | 12.19 | 11.39 | 0.81 (7.08%) |
| ldbc_snb_ic | q35 | 756.21 | 771.29 | -15.07 (-1.95%) |
| ldbc_snb_ic | q36 | 45.73 | 48.19 | -2.45 (-5.09%) |
| ldbc_snb_is | q32 | 9.30 | 8.69 | 0.62 (7.10%) |
| ldbc_snb_is | q33 | 17.74 | 15.30 | 2.44 (15.92%) |
| ldbc_snb_is | q34 | 8.10 | 7.73 | 0.36 (4.71%) |
| multi-rel | multi-rel-large-scan | 2785.03 | 2881.53 | -96.50 (-3.35%) |
| multi-rel | multi-rel-lookup | 49.54 | 66.20 | -16.66 (-25.17%) |
| multi-rel | multi-rel-small-scan | 78.40 | 53.93 | 24.46 (45.36%) |
| order_by | q25 | 167.35 | 148.91 | 18.44 (12.38%) |
| order_by | q26 | 492.90 | 464.11 | 28.79 (6.20%) |
| order_by | q27 | 1431.45 | 1422.50 | 8.95 (0.63%) |
| scan_after_filter | q01 | 206.70 | 192.10 | 14.60 (7.60%) |
| scan_after_filter | q02 | 195.30 | 182.94 | 12.35 (6.75%) |
| shortest_path_ldbc100 | q39 | 90.51 | 97.14 | -6.64 (-6.83%) |
| var_size_expr_evaluator | q03 | 2094.03 | 2071.60 | 22.42 (1.08%) |
| var_size_expr_evaluator | q04 | 2263.45 | 2263.46 | -0.01 (-0.00%) |
| var_size_expr_evaluator | q05 | 2734.67 | 2624.07 | 110.60 (4.21%) |
| var_size_expr_evaluator | q06 | 1372.99 | 1337.83 | 35.16 (2.63%) |
| var_size_seq_scan | q19 | 1506.13 | 1484.97 | 21.16 (1.42%) |
| var_size_seq_scan | q20 | 3160.75 | 3194.70 | -33.95 (-1.06%) |
| var_size_seq_scan | q21 | 2438.87 | 2430.11 | 8.76 (0.36%) |
| var_size_seq_scan | q22 | 134.95 | 133.10 | 1.86 (1.40%) |
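
For readers checking the numbers, the Diff column above appears to be the commit mean minus the master mean, with the percentage taken relative to master. The following is a minimal C++ sketch of that arithmetic. The `BenchmarkRow` struct and the helper names are hypothetical and are not taken from the benchmark tooling. Because the table shows means rounded to two decimals, recomputing from the displayed values can differ from the reported Diff in the last digit (for example, q31 gives 0.80 rather than 0.81).

```cpp
#include <cmath>
#include <string>

// Hypothetical record for one benchmark row; the field names are
// illustrative and do not come from the benchmark tooling.
struct BenchmarkRow {
    std::string group;
    std::string query;
    double commitMs;  // "Mean Time - Commit (ms)"
    double masterMs;  // "Mean Time - Master (ms)"
};

// Absolute difference in ms. Positive means the branch is slower than master.
double diffMs(const BenchmarkRow& row) {
    return row.commitMs - row.masterMs;
}

// Relative difference as a percentage of master's mean time.
double diffPercent(const BenchmarkRow& row) {
    return diffMs(row) / row.masterMs * 100.0;
}

// Round to two decimal places, the precision shown in the table.
double round2(double value) {
    return std::round(value * 100.0) / 100.0;
}
```

Under these assumptions, the q24 row (686.32 vs 668.53) yields 17.79 ms and 2.66%, and the q28 row (10911.91 vs 11953.42) yields -1041.51 ms and -8.71%, matching the table.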

…really really long name so that windows can use the repository
Benchmark Result

Master commit hash: 86792c1c710a333cb9d796d756f42ae800314db5
Branch commit hash: 3728ac0e384b2d67554f5263f3d8324ece9a7253

| Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff (ms, %) |
|---|---|---|---|---|
| aggregation | q24 | 691.49 | 681.95 | 9.54 (1.40%) |
| aggregation | q28 | 11938.23 | 11453.53 | 484.69 (4.23%) |
| filter | q14 | 161.20 | 163.07 | -1.87 (-1.15%) |
| filter | q15 | 162.25 | 159.20 | 3.05 (1.92%) |
| filter | q16 | 334.62 | 339.92 | -5.29 (-1.56%) |
| filter | q17 | 482.42 | 482.72 | -0.29 (-0.06%) |
| filter | q18 | 1987.70 | 1992.68 | -4.98 (-0.25%) |
| fixed_size_expr_evaluator | q07 | 582.88 | 579.90 | 2.98 (0.51%) |
| fixed_size_expr_evaluator | q08 | 788.56 | 786.46 | 2.11 (0.27%) |
| fixed_size_expr_evaluator | q09 | 787.84 | 797.97 | -10.13 (-1.27%) |
| fixed_size_expr_evaluator | q10 | 276.97 | 275.16 | 1.81 (0.66%) |
| fixed_size_expr_evaluator | q11 | 271.96 | 269.65 | 2.31 (0.86%) |
| fixed_size_expr_evaluator | q12 | 270.86 | 268.40 | 2.46 (0.91%) |
| fixed_size_expr_evaluator | q13 | 1511.24 | 1503.07 | 8.17 (0.54%) |
| fixed_size_seq_scan | q23 | 155.70 | 153.64 | 2.05 (1.34%) |
| join | q31 | 12.65 | 12.40 | 0.25 (1.99%) |
| ldbc_snb_ic | q35 | 749.24 | 855.87 | -106.63 (-12.46%) |
| ldbc_snb_ic | q36 | 47.46 | 47.34 | 0.12 (0.25%) |
| ldbc_snb_is | q32 | 9.24 | 8.54 | 0.71 (8.26%) |
| ldbc_snb_is | q33 | 16.96 | 17.45 | -0.49 (-2.81%) |
| ldbc_snb_is | q34 | 9.36 | 8.28 | 1.08 (13.09%) |
| multi-rel | multi-rel-large-scan | 2789.33 | 2818.27 | -28.94 (-1.03%) |
| multi-rel | multi-rel-lookup | 76.04 | 66.14 | 9.91 (14.98%) |
| multi-rel | multi-rel-small-scan | 55.65 | 73.68 | -18.02 (-24.46%) |
| order_by | q25 | 167.54 | 163.73 | 3.81 (2.33%) |
| order_by | q26 | 483.69 | 486.59 | -2.90 (-0.60%) |
| order_by | q27 | 1485.99 | 1445.55 | 40.45 (2.80%) |
| scan_after_filter | q01 | 210.11 | 207.58 | 2.52 (1.21%) |
| scan_after_filter | q02 | 196.37 | 196.42 | -0.05 (-0.03%) |
| shortest_path_ldbc100 | q39 | 115.97 | 116.31 | -0.34 (-0.29%) |
| var_size_expr_evaluator | q03 | 2095.78 | 2084.35 | 11.43 (0.55%) |
| var_size_expr_evaluator | q04 | 2272.93 | 2293.15 | -20.22 (-0.88%) |
| var_size_expr_evaluator | q05 | 2646.18 | 2700.12 | -53.95 (-2.00%) |
| var_size_expr_evaluator | q06 | 1417.90 | 1411.63 | 6.27 (0.44%) |
| var_size_seq_scan | q19 | 1502.96 | 1496.14 | 6.82 (0.46%) |
| var_size_seq_scan | q20 | 3174.94 | 3170.00 | 4.94 (0.16%) |
| var_size_seq_scan | q21 | 2453.28 | 2421.56 | 31.72 (1.31%) |
| var_size_seq_scan | q22 | 137.94 | 138.55 | -0.60 (-0.44%) |

codecov bot commented Aug 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.94%. Comparing base (9a97f75) to head (05716c0).
Report is 10 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4100      +/-   ##
==========================================
+ Coverage   83.90%   83.94%   +0.04%     
==========================================
  Files        1303     1305       +2     
  Lines       51387    51362      -25     
  Branches     7143     7139       -4     
==========================================
  Hits        43116    43116              
+ Misses       8125     8100      -25     
  Partials      146      146              


@mxwli merged commit 87de7f9 into master on Aug 19, 2024
23 checks passed
@mxwli deleted the glob-regex branch on August 19, 2024 at 19:11
ted-wq-x pushed a commit to ted-wq-x/kuzu that referenced this pull request Nov 14, 2024
Development

Successfully merging this pull request may close these issues.

Regex error: The complexity of an attempted match against a regular expression exceeded a pre-set level.