ripgrep is slow when detecting whitespace literals, e.g., '[ \t]$'
#1087
Comments
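Thanks for the easy reproduction! Much appreciated.
This is unfortunately a case where ripgrep's literal optimizations end up making it slower rather than faster. The presence of literal optimizations virtually guarantees that cases like this exist, although we might consider some common sense heuristics to improve it. e.g., If all detected literals are just whitespace, then it's probably better not to use them. That would be a fairly easy change in `grep-regex/src/literal.rs`.
You can test this by trying ripgrep with PCRE2. For example, `rg '[ \t]$' -P -U --no-pcre2-unicode --mmap` runs faster on my machine than `ag '[ \t]$'`. In particular, the flags given to ripgrep here enable the same performance profile as ag, albeit with PCRE2 instead of PCRE1. Similarly, `rg '\s$'` runs quite a bit faster than `ag '\s$'` (although `rg -U '\s$'` is the more accurate comparison, it still runs quite a bit faster).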
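As a rough illustration of why whitespace literals can hurt, here is a small Python sketch. It is not ripgrep's code: the helper names `count_candidates` and `count_matches` and the sample text are made up for this example, and it assumes a simplified model in which a prefilter finds literal occurrences and hands each one to the regex engine for verification. Under that assumption, a literal as common as a space produces many candidates for very few real matches.

```python
import re
from typing import Iterable


def count_candidates(haystack: str, literals: Iterable[str]) -> int:
    """Count occurrences of the prefilter literals (hypothetical model:
    each occurrence is a position the regex engine would have to verify)."""
    return sum(haystack.count(lit) for lit in literals)


def count_matches(haystack: str, pattern: str) -> int:
    """Count actual regex matches, with `$` matching at every line end."""
    return len(re.findall(pattern, haystack, flags=re.MULTILINE))


# Made-up sample: only the second line has trailing whitespace.
text = 'fn main() {\n    let x = 1; \n    println!("{}", x);\n}\n'

candidates = count_candidates(text, [" ", "\t"])
matches = count_matches(text, r"[ \t]$")
print(f"candidates to verify: {candidates}, real matches: {matches}")
```

On ordinary source code, nearly every line contains a space, so the gap between candidates and real matches grows with the size of the corpus.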
On Fri, 19 Oct 2018 03:58:27 -0700 Andrew Gallant ***@***.***> wrote:
This is unfortunately a case where ripgrep's literal optimizations end up
making it slower rather than faster. The presence of literal optimizations
virtually guarantees that cases like this exist, although we might consider
some common sense heuristics to improve it. e.g., If all detected literals
are just whitespace, then it's probably better not to use them. That would be
a fairly easy change in `grep-regex/src/literal.rs`.
You can test this by trying ripgrep with PCRE2. For example, `rg '[ \t]$' -P
-U --no-pcre2-unicode --mmap` runs faster on my machine than `ag '[ \t]$'`.
In particular, the flags given to ripgrep here enable the same performance
profile as ag, albeit with PCRE2 instead of PCRE1. Similarly, `rg '\s$'` runs
quite a bit faster than `ag '\s$'` (although, `rg -U '\s$'` is the more
accurate comparison, but still runs quite a bit faster).
Thanks for the reply.
If a literal is entirely whitespace, then it's quite likely that it is very common. So when that case occurs, just don't do (inner) literal optimizations at all. The regex engine may still make sub-optimal decisions here, but that's a problem for another day. Fixes #1087
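The heuristic in this commit message could look roughly like the following Python sketch. This is an illustration only, not the actual change in ripgrep: the function name `usable_literals` is hypothetical, and it assumes that a single all-whitespace literal is enough to disable the optimization, which is one reading of "if a literal is entirely whitespace".

```python
from typing import List, Optional, Sequence


def usable_literals(literals: Sequence[str]) -> Optional[List[str]]:
    """Return the literals to prefilter with, or None to skip the
    literal optimization and let the regex engine search directly.

    Assumption: any literal made up entirely of whitespace is treated
    as too common to be a useful prefilter.
    """
    if not literals:
        # Nothing was extracted, so there is nothing to optimize with.
        return None
    if any(lit.isspace() for lit in literals):
        # str.isspace() is True only for non-empty, all-whitespace strings.
        return None
    return list(literals)


print(usable_literals([" ", "\t"]))     # literals from a pattern like '[ \t]$'
print(usable_literals(["foo", "bar"]))  # ordinary literals are kept
```

A literal such as `"foo "` is kept, since it is not entirely whitespace and is likely to be much rarer than a bare space.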
Thanks for fixing it, @BurntSushi! 👍
What version of ripgrep are you using?
ripgrep 0.10.0
-SIMD -AVX (compiled)
+SIMD -AVX (runtime)
How did you install ripgrep?
cargo install ripgrep
What operating system are you using ripgrep on?
My system is Mageia Linux v7 x86-64 on a Core i3 Sandy Bridge machine.
rg is much slower than ag when searching for the `'[ \t]$'` regex on a sample repo. With this benchmark case:
I am getting the following: