libripgrep: PCRE2 support, multiline search, JSON output and more #1017
libripgrep is not any one library, but rather a collection of libraries that roughly separate the following key distinct phases in a grep implementation:

1. Pattern matching (e.g., by a regex engine).
2. Searching a file using a pattern matcher.
3. Printing results.

Ultimately, both (1) and (3) are defined by de-coupled interfaces, of which there may be multiple implementations. Namely, (1) is satisfied by the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by the `Sink` trait in the `grep2` crate. The searcher (2) ties everything together and finds results using a matcher and reports those results using a `Sink` implementation.

Closes #162
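The three phases described in the commit message above can be illustrated with a minimal, standard-library-only sketch. This is not the actual `grep-matcher` or searcher API: the trait methods, `LiteralMatcher`, `CollectSink`, and the `search` function are all hypothetical names chosen for illustration, and the real traits are considerably richer. The point is only to show how a searcher can stay agnostic about both the matching engine and the output format.

```rust
/// Phase 1: pattern matching. A simplified, hypothetical stand-in for a
/// `Matcher`-style trait (not the real grep-matcher signature).
trait Matcher {
    /// Returns the byte range of the first match in `line`, if any.
    fn find(&self, line: &str) -> Option<(usize, usize)>;
}

/// Phase 3: reporting results. A simplified, hypothetical stand-in for a
/// `Sink`-style trait.
trait Sink {
    /// Called once per matching line. Returning `false` stops the search.
    fn matched(&mut self, line_number: usize, line: &str) -> bool;
}

/// A toy matcher that looks for a literal substring.
struct LiteralMatcher {
    needle: String,
}

impl Matcher for LiteralMatcher {
    fn find(&self, line: &str) -> Option<(usize, usize)> {
        line.find(self.needle.as_str())
            .map(|start| (start, start + self.needle.len()))
    }
}

/// A toy sink that collects `line_number:line` strings instead of printing.
struct CollectSink {
    lines: Vec<String>,
}

impl Sink for CollectSink {
    fn matched(&mut self, line_number: usize, line: &str) -> bool {
        self.lines.push(format!("{}:{}", line_number, line));
        true
    }
}

/// Phase 2: the searcher ties a matcher and a sink together. It walks the
/// haystack line by line and reports each matching line to the sink.
fn search<M: Matcher, S: Sink>(matcher: &M, haystack: &str, sink: &mut S) {
    for (i, line) in haystack.lines().enumerate() {
        if matcher.find(line).is_some() && !sink.matched(i + 1, line) {
            break;
        }
    }
}

fn main() {
    let matcher = LiteralMatcher { needle: "grep".to_string() };
    let mut sink = CollectSink { lines: Vec::new() };
    search(&matcher, "ripgrep\nfoo\nlibripgrep\n", &mut sink);
    for line in &sink.lines {
        println!("{}", line);
    }
}
```

Under this shape, swapping the regex engine means supplying another `Matcher` implementation, and a different output format means supplying another `Sink`, with `search` left unchanged.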
This commit does the work to delete the old `grep` crate and effectively rewrite most of ripgrep core to use the new libripgrep crates. The new `grep` crate is now a facade that collects the various crates that make up libripgrep. The most complex part of ripgrep core is now arguably the translation between command line parameters and the library options, which is ultimately where we want to be.
This basically rewrites every integration test. We reduce the amount of magic involved here in terms of which arguments are being passed to ripgrep processes. To make up for the boilerplate saved by the magic, we make the Dir (formerly WorkDir) type a bit nicer to use, along with a new TestCommand that wraps a std::process::Command. In exchange, we get tests that are easier to read and write. We also run every test with the `--pcre2` flag to make sure that works, when PCRE2 is available.
This commit updates the CHANGELOG to reflect all the work done to make libripgrep a reality.

* Closes #162 (libripgrep)
* Closes #176 (multiline search)
* Closes #188 (opt-in PCRE2 support)
* Closes #244 (JSON output)
* Closes #416 (Windows CRLF support)
* Closes #917 (trim prefix whitespace)
* Closes #993 (add --null-data flag)
* Closes #997 (--passthru works with --replace)
* Fixes #2 (memory maps and context handling work)
* Fixes #200 (ripgrep stops when pipe is closed)
* Fixes #389 (more intuitive `-w/--word-regexp`)
* Fixes #643 (detection of stdin on Windows is better)
* Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
* Fixes #764 (coalesce color escapes)
* Fixes #922 (memory maps failing is no big deal)
* Fixes #937 (color escapes no longer used for empty matches)
* Fixes #940 (--passthru does not impact exit status)
* Fixes #1013 (show runtime CPU features in --version output)
This is huge and amazing, thank you! I was wondering whether a I also had some feedback on the zsh completion, mostly just nit-picking, but a few more functional things (like exclusivity errors). Rather than flood you with a bunch of comments, i think i'll prepare a patch that can be applied against your branch. One particular issue i do see with the completion is that, since We could work around that by giving Let me know how you feel about I only looked at a few other things but so far everything else works great for me.
Something like this? https://github.com/BurntSushi/ripgrep/compare/ag/libripgrep...okdana:dana/libripgrep?expand=1 (raw diff, raw diff minus
PS: Treating passed-through lines as context is so much smarter than the old method (and totally obvious in hindsight), i like that a lot
This is awesome. I know a lot of people will be very happy about being able to search with PCRE2 in vscode!
Thanks for the comments @okdana!
So my thinking here is that But, it doesn't cost much to add the flag and maybe it's worth it for consistency reasons. We don't yet do it for every flag though I don't think, but it's probably pretty close.
That's great, thanks! Would it work for you if I merged this PR and then you just submit the patch against master? I'm not planning on doing a release imminently or anything. Also, in general, I've found it pretty difficult to update the completion script. I knew I probably got some stuff wrong and I'm grateful you've caught it. :-) I can't tell whether a little bit of documentation describing the syntax would fix my problem editing the completion rules, or if it would be better to try to generate the completion script like we generate the man page (potentially by adding additional metadata on
Yeah, I'm not sure what I want to do here. You're right that making
ripgrep does kind of have at least the guts to make this happen now I think. This was in large part motivated by my desire to generate a quality man page directly from the flag definitions. This is why
Hah yeah. It also fixes the exit status bug nicely. It would not have been feasible to do in the old ripgrep code though! @okdana Your patch LGTM, thank you. :-) I'm fine with taking in
Nice work and big thanks from me too! I've been using libripgrep in my own app, and the results are much better than what I would have come up with by myself. It's also introduced me to the world of Rust and having full control over memory and everything else. It's been an adventure, mostly good :).
Yeah, i can do that.
I worry that, unless it was extremely comprehensive, any automatic generation done by ripgrep would have similar limitations to the ones Clap had. Obviously it can be done, but i don't know, i have mixed feelings i guess. Short of that, maybe it would help if i put together a little Markdown guide to explain in simple terms how the option specifications work, and what the conventions are? I think i could make something that's easier to follow than the official documentation, at least for the sub-set of features that most of our specs use. Then we could put that in the repo and link to it at the top of the function.
@okdana A small guide would be great! I think it would be nice to just inline it into a comment in the completion script though, just so that it's all in one place.
I'm not too familiar with the limitations that Clap had, but yeah, that's why I would expect we'd need to add additional structure to Either way, I suspect a mini-guide describing how the completion script works is an excellent next step. Thank you. :-)
This is a bit off topic...
I'm surprised we are still using custom completion scripts for each command and each shell... Back when Bash didn't yet have support for user completion rules I had my own Bash fork which gave completions and option help when I typed To make things a bit less heuristic, each command could support some option, e.g. Maybe the Rust community can get a convention started?
@kankri I think it would be better if you sought out the Rust CLI working group: https://github.com/rust-lang-nursery/cli-wg There hasn't been too much discussion on completions there, so you might be able to get the ball rolling! https://github.com/rust-lang-nursery/cli-wg/search?q=completions&type=Issues I personally don't have any plans to output ripgrep's options in a machine readable format. I do still think it's possible to generate completions with the same level of quality as our current manually curated completions, but this will probably require attaching more metadata to each flag, so it requires some work and I haven't thought through it completely yet. @okdana's excellent docs have honestly reduced my desire for this since I think I can now competently edit the completion file. Otherwise, I don't really see any good reason to be "surprised." It's not surprising at all that good context sensitive completions require a lot more knowledge than what you typically see in a command's
After a lot of work, libripgrep is finally ready to be merged to master. This includes planned features such as multiline search and JSON output, but also a surprise: ripgrep now provides the ability to opt into using PCRE2 as the regex engine instead of Rust's default regex engine. This provides a way to use look-around and backreferences with ripgrep. This support should work on Windows, macOS and Linux.
libripgrep still needs high level documentation in the form of a cookbook and a guide, so it is not quite ready for wide use yet. However, the API itself is minimally documented, so ambitious individuals should feel empowered to try it out. :-)
Closes #162 (libripgrep)
Closes #176 (multiline search)
Closes #188 (opt-in PCRE2 support)
Closes #244 (JSON output)
Closes #416 (Windows CRLF support)
Closes #917 (trim prefix whitespace)
Closes #993 (add --null-data flag)
Closes #997 (--passthru works with --replace)
Fixes #2 (memory maps and context handling work)
Fixes #200 (ripgrep stops when pipe is closed)
Fixes #389 (more intuitive `-w/--word-regexp`)
Fixes #643 (detection of stdin on Windows is better)
Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
Fixes #764 (coalesce color escapes)
Fixes #922 (memory maps failing is no big deal)
Fixes #937 (color escapes no longer used for empty matches)
Fixes #940 (--passthru does not impact exit status)
Fixes #1013 (show runtime CPU features in --version output)
cc @roblourens