Coalesce adjacent loops in concatenation RegexNodes #1838

Merged
merged 2 commits into from
Jan 17, 2020

Commits on Jan 17, 2020

  1. Coalesce adjacent loops in concatenation RegexNodes

    This augments the reduction phase of concatenation nodes to combine adjacent one/notone/set loops, e.g. `a*a+a{1,2}b` becomes `a{2,}b` (previously added optimizations will then see that the `a` loop can be made atomic and replace it with the equivalent of `(?>a{2,})b`). This has several benefits. First, it simplifies the node tree, creating less work for the IR writer and for the interpreter/compiler. Second, it gives the compiler more opportunity to choose how the loop should be represented, when and how to unroll, etc. Third, it enables the auto-atomicity step to apply to more loops (as in the previous example). Fourth, and most importantly, it can drastically reduce backtracking (especially with the atomicity optimization, but even without it). An expression like `a*a*a*a*a*a*b` run against an input like `aaaaaaaaaaaaaa` could previously take a very long time, because the input contains no `b` and the engine had to try every way of dividing the `a`s among the six loops before failing; now it is very fast.
    stephentoub committed Jan 17, 2020
    f24fadf
  2. 7da39ac
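
The coalescing described in the first commit message can be illustrated with a small sketch. This is not the RegexNode reduction code from the PR; it is a simplified Python model that assumes each loop is a greedy single-character loop written as a hypothetical `(char, min, max)` tuple, with `max=None` meaning unbounded. Adjacent loops over the same character merge by summing their minimums and maximums.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical model of a loop node: (character, min count, max count).
# max is None when the loop is unbounded. All loops are assumed greedy.
Loop = Tuple[str, int, Optional[int]]


def coalesce(loops: List[Loop]) -> List[Loop]:
    """Merge adjacent loops over the same character into a single loop."""
    result: List[Loop] = []
    for ch, lo, hi in loops:
        if result and result[-1][0] == ch:
            _, prev_lo, prev_hi = result[-1]
            # An unbounded loop on either side makes the merged loop unbounded.
            new_hi = None if prev_hi is None or hi is None else prev_hi + hi
            result[-1] = (ch, prev_lo + lo, new_hi)
        else:
            result.append((ch, lo, hi))
    return result


def to_pattern(loops: List[Loop]) -> str:
    """Render the modeled loops back into regex syntax."""
    parts = []
    for ch, lo, hi in loops:
        if lo == 1 and hi == 1:
            parts.append(re.escape(ch))
        else:
            upper = "" if hi is None else str(hi)
            parts.append(f"{re.escape(ch)}{{{lo},{upper}}}")
    return "".join(parts)


# a*a+a{1,2}b from the commit message, expressed in the model above.
original = [("a", 0, None), ("a", 1, None), ("a", 1, 2), ("b", 1, 1)]
reduced = coalesce(original)
print(to_pattern(reduced))
```

On whole-string matches the reduced pattern accepts the same inputs as the original, which can be spot-checked with `re.fullmatch` on strings such as `"ab"` (rejected by both) and `"aab"` (accepted by both). The real reduction also has to handle notone and set loops and lazy versus greedy loops, which this sketch leaves out.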