Why does ripgrep traverse into directories that don't match the provided glob? #1304

Closed
oliversalzburg opened this issue Jun 17, 2019 · 2 comments
Labels
duplicate An issue that is duplicative of another. enhancement An enhancement to the functionality of the software.

Comments

@oliversalzburg

What version of ripgrep are you using?

ripgrep 11.0.1 (rev 973de50c9e)
-SIMD -AVX (compiled)

How did you install ripgrep?

Downloaded the binary from the GitHub Releases page.

What operating system are you using ripgrep on?

Windows 10 x64

Describe your question, feature request, or bug.

While investigating an issue with a VS Code extension, I looked at how ripgrep behaves when given the underlying command that was causing the excessive load.

The relevant parts of the command are:

rg --files --follow --no-ignore -g /test/**/*.js

This quickly prints the desired file names from the test subfolder of the current working directory, and then prints huge chains of folder names (resulting from symlinks) from a folder named node_modules, also in the current working directory.

Now my question is, why does ripgrep even traverse down into the node_modules folder, when it doesn't match the requested glob anyway? Is that by design or is that unintended behavior?

@BurntSushi
Owner

Now my question is, why does ripgrep even traverse down into the node_modules folder, when it doesn't match the requested glob anyway? Is that by design or is that unintended behavior?

Neither. It's simply a missed performance optimization. While it's obvious to us humans that looking in node_modules is a waste of time, ripgrep doesn't know that because it doesn't do the sophisticated analysis on globs required to implement the optimization you want here. In particular, ripgrep must descend into directories when given a glob like this unless the glob specifically excludes that directory, otherwise ripgrep might miss files. For example, the path /test does not match the glob /test/**/*.js. So in order to implement this optimization, ripgrep would need to reason about glob prefixes and the like.
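To make the "reason about glob prefixes" idea concrete, here is a minimal sketch in Python. It is not ripgrep's implementation, and the helper names literal_prefix and may_contain_match are hypothetical. It assumes an anchored glob with "/" separators and only ever answers conservatively: it returns False for a directory only when no file beneath it could match.

```python
# Hypothetical sketch of prefix-based directory pruning; not ripgrep's code.
WILDCARDS = set("*?[{")


def literal_prefix(glob):
    """Return the leading directory components of a glob that contain no wildcards."""
    parts = [p for p in glob.split("/") if p]
    prefix = []
    # The last component names files, so only the earlier ones describe directories.
    for part in parts[:-1]:
        if WILDCARDS & set(part):
            break
        prefix.append(part)
    return prefix


def may_contain_match(directory, glob):
    """Return False only when no file under `directory` can match `glob`."""
    prefix = literal_prefix(glob)
    dir_parts = [p for p in directory.split("/") if p]
    # Compare the components both paths share; anything deeper is assumed possible.
    return all(d == p for d, p in zip(dir_parts, prefix))


if __name__ == "__main__":
    glob = "/test/**/*.js"
    for directory in ["test", "test/unit", "node_modules"]:
        print(directory, may_contain_match(directory, glob))
```

Under these assumptions, test and test/unit are kept because they agree with the literal prefix, while node_modules can be skipped without ever matching the full glob against a directory path.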

Moreover, the command you've presented is fairly idiosyncratic. I don't understand why you're using a glob for this instead of just using your shell's glob expansion to search the test directory directly, e.g., rg --files --follow --no-ignore /test/**/*.js. If your shell isn't capable of that, then try rg --files --follow --no-ignore /test -g '*.js'.
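The reason the second form avoids the problem can be illustrated with a small Python analogue. This is only a sketch of the idea, not how ripgrep works internally, and list_files is a hypothetical helper. The walk starts at the given root, so sibling directories such as node_modules are never visited, and the pattern is applied to file names only. Symlinks are not followed here, since os.walk does not guard against symlink cycles.

```python
import fnmatch
import os


def list_files(root, pattern):
    """Return files under `root` whose base name matches `pattern`, sorted."""
    found = []
    # Starting the walk at `root` means directories outside it are never entered.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    for path in list_files("test", "*.js"):
        print(path)
```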

Either way, this is a duplicate of #546, so I'm going to close this one as well, for the same reason.

@BurntSushi BurntSushi added duplicate An issue that is duplicative of another. enhancement An enhancement to the functionality of the software. labels Jun 17, 2019
@oliversalzburg
Author

oliversalzburg commented Jun 17, 2019 via email
