rg 11.0.0 major performance regression #1268

Closed
ulchie opened this issue Apr 24, 2019 · 5 comments
Labels
bug A bug.

Comments

ulchie commented Apr 24, 2019

What version of ripgrep are you using?

$ /local/rulch/rg10/ripgrep-0.10.0-x86_64-unknown-linux-musl/rg --version
ripgrep 0.10.0 (rev 8a7db1a)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

vs

$ /local/rulch/rg1100/ripgrep-11.0.0-x86_64-unknown-linux-musl/rg --version
ripgrep 11.0.0 (rev d7f57d9)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

Downloaded from the prebuilt archives.

What operating system are you using ripgrep on?

Ubuntu 16:
$ uname -a
Linux #hostname_elided# 4.4.0-141-generic #167-Ubuntu SMP Wed Dec 5 10:40:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

This also appears to reproduce on Ubuntu 12. I found the problem while introducing ripgrep to a coworker: it was running slowly on his machine relative to what I'm used to. I then downloaded the same release to my machine and saw the same performance difference.

Describe your question, feature request, or bug.

I'm seeing a ~5x slowdown for identical searches on identical files (results repeatable).

If this is a bug, what are the steps to reproduce the behavior?

I cannot share the corpus from my own work repository, so I tried with the Open Office repo, where I see a slowdown of 3-4x:
$ time /local/rulch/rg1100/ripgrep-11.0.0-x86_64-unknown-linux-musl/rg --debug okay > out-new 2>&1

real 0m0.808s
user 0m4.724s
sys 0m3.192s

$ time /local/rulch/rg10/ripgrep-0.10.0-x86_64-unknown-linux-musl/rg --debug okay > out-old 2>&1

real 0m0.252s
user 0m1.304s
sys 0m1.328s

Open Office Repo: https://github.com/apache/openoffice @ commit 633ece0517

Please see the attachments for the new and old output with --debug provided.
out-new.txt
out-old.txt

In the Open Office repository the difference is not all that significant in absolute time, but in my repository at work searches take a few seconds instead of half a second (at least for a smallish number of results) on a 24-core machine. On less powerful CPUs I can see this being pretty impactful. It's annoying, but still faster than grep. Did some default behavior change between 0.10.0 and 11.0.0 that could explain this?

@BurntSushi BurntSushi added the bug A bug. label Apr 24, 2019
BurntSushi (Owner) commented

Oof. Great bug and thanks for the report. I can indeed reproduce this:

[andrew@blackfoot openoffice]$ time /tmp/ripgrep-release/ripgrep-0.10.0-x86_64-unknown-linux-musl/rg --no-config okay -q --stats

99 matches
99 matched lines
81 files contained matches
62261 files searched
0 bytes printed
1517548317 bytes searched
1.194573 seconds spent searching
0.185614 seconds

real    0.190
user    1.147
sys     0.946
maxmem  20 MB
faults  0
[andrew@blackfoot openoffice]$ time /tmp/ripgrep-release/ripgrep-11.0.1-x86_64-unknown-linux-musl/rg --no-config okay -q --stats

99 matches
99 matched lines
81 files contained matches
62261 files searched
0 bytes printed
1517548260 bytes searched
0.791702 seconds spent searching
0.729965 seconds

real    0.733
user    7.369
sys     1.069
maxmem  14 MB
faults  0

To make things weirder, if I use my compiled version of ripgrep, then I don't get the regression:

[andrew@blackfoot openoffice]$ time rg --no-config okay -q --stats

99 matches
99 matched lines
81 files contained matches
62261 files searched
0 bytes printed
1517548260 bytes searched
0.980003 seconds spent searching
0.187177 seconds

real    0.193
user    0.914
sys     1.080
maxmem  18 MB
faults  0
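
For reference, the ratios implied by the three runs above can be worked out with a small sketch. The `Timing` struct and the `slowdown` and `report` helpers are hypothetical names made up for illustration; only the numbers come from the `time` output above.

```rust
/// Timings in seconds, copied from the `time` output of the runs above.
#[derive(Clone, Copy)]
struct Timing {
    real: f64,
    user: f64,
    sys: f64,
}

/// How many times slower `new` is than `baseline` (below 1.0 means faster).
fn slowdown(new: f64, baseline: f64) -> f64 {
    new / baseline
}

/// Formats the real/user/sys ratios of `new` relative to `baseline`.
fn report(label: &str, new: Timing, baseline: Timing) -> String {
    format!(
        "{}: real {:.1}x, user {:.1}x, sys {:.1}x",
        label,
        slowdown(new.real, baseline.real),
        slowdown(new.user, baseline.user),
        slowdown(new.sys, baseline.sys),
    )
}

fn main() {
    // Baseline: the 0.10.0 musl release binary.
    let musl_0_10_0 = Timing { real: 0.190, user: 1.147, sys: 0.946 };
    // The 11.0.1 musl release binary and a locally compiled (glibc) build.
    let musl_11_0_1 = Timing { real: 0.733, user: 7.369, sys: 1.069 };
    let local_build = Timing { real: 0.193, user: 0.914, sys: 1.080 };

    println!("{}", report("11.0.1 musl", musl_11_0_1, musl_0_10_0));
    println!("{}", report("local build", local_build, musl_0_10_0));
}
```

With these numbers, the 11.0.1 musl binary comes out at roughly 3.9x the wall-clock time and 6.4x the user CPU time of 0.10.0, while the locally compiled build is on par with 0.10.0.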

If it's not too much trouble, could you try building ripgrep from master and seeing if you still get the performance regression?

The only real difference between master and the 11.0.1 binary releases on Linux is that the binary releases link against musl for libc instead of glibc (which is what will happen by default when compiling master).

ulchie (Author) commented Apr 24, 2019

With the build from master, there is no performance regression:

$ time ../rg11/ripgrep-11.0.1-x86_64-unknown-linux-musl/rg okay > /dev/null

real 0m0.835s
user 0m4.840s
sys 0m3.264s

$ time ../rg10/ripgrep-0.10.0-x86_64-unknown-linux-musl/rg okay > /dev/null

real 0m0.217s
user 0m1.224s
sys 0m1.220s

(Built rg from master and threw on PATH)
$ time rg okay > /dev/null

real 0m0.185s
user 0m0.984s
sys 0m1.032s

$ rg --version
ripgrep 11.0.1 (rev e7829c0)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

BurntSushi (Owner) commented

Thanks again for the bug report! It turns out that musl's allocator was slowing ripgrep down quite a bit. That is, building ripgrep 0.10.0 with musl and a newer version of Rust exhibits the same performance regression, and profiling shows a lot of time being spent in musl's allocator. This showed up as a regression in ripgrep 11 because Rust recently stopped bundling jemalloc by default and now uses the system allocator instead.

I've pushed a fix which goes back to using jemalloc when building ripgrep with musl. I'll consider putting out a patch release at some point soon, since this is a pretty bad performance regression.
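
As a rough illustration of the mechanism involved (not the actual patch), Rust lets a binary choose its allocator with the `#[global_allocator]` attribute. The sketch below uses only the standard library: `CountingAlloc` is a hypothetical wrapper that forwards to the system allocator and counts calls, standing in for the spot where a jemalloc-backed allocator type would be installed. Gating the real swap on `cfg(target_env = "musl")` is an assumption about how such a fix could be scoped to musl builds.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: counts allocations, then defers to the system
/// allocator (musl's malloc on a musl target, glibc's on a glibc target).
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through this static.
// A musl-only swap would place a different allocator type here, behind
// something like #[cfg(target_env = "musl")] (an assumption, see above).
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of allocations observed so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocation_count();
    let lines: Vec<String> = (0..1000).map(|i| format!("line {}", i)).collect();
    std::hint::black_box(&lines);
    let after = allocation_count();

    println!("built for musl: {}", cfg!(target_env = "musl"));
    println!("{} strings took {} allocations", lines.len(), after - before);
}
```

The point of the counter is only to show how allocation-heavy even simple string handling is, which is why a slow allocator can dominate a profile.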

Thanks so much for noticing and for providing a reproducible bug report!

ulchie (Author) commented Apr 24, 2019

No problem! Thanks for being so quick to respond, and for the awesome tool. I use rg many times every day.

BurntSushi (Owner) commented

Looks like something is broken with musl + jemalloc + Ubuntu 16.04. I've filed a bug upstream here: gnzlbg/jemallocator#124

That will block a release with this fix unfortunately.
