-
-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Respect globs array order, fixes #25 #27
Conversation
ea2f8e2
to
a8a7011
Compare
|
||
if (positives.length === 0) throw new Error("Missing positive glob"); | ||
|
||
// only one positive glob no need to aggregate | ||
if (positives.length === 1) return gs.createStream(positives[0], negatives, opt); | ||
if (positives.length === 1) { | ||
return gs.createStream(positives[0].glob, negatives.filter(indexGreaterThan(positives[0].index)).map(toGlob), opt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no crazy one-liners
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also this was a small perf optimization (checking for length 1) that can probably be removed now that we are ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the one-liner: I've put it in a single line for consistency with this line below. Should I just add some line breaks in the middle of the expression or create unnecessary temp vars?
About length === 1
: AFAICS the same perf benefit still holds after this patch. That is, although the globs are now ordered, the file emission order inside each individual stream remains unordered. I don't quite see why this optimization should be removed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@UltCombo I switched from coffee -> JS then wrote this lib so I was still fond of one-liners. Now I understand how much they mess up code readability by not having meaningful variable names for important stuff. If you could remove any crazy one liners and use var names to update it to the new (and better) style that I've adopted across these projects that would be cool 🍻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, wouldn't you have a style guide around? It is a bit hard to determine the proper style without one.
I've pushed a new commit, PTAL.
a8a7011
to
bf90ab1
Compare
bf90ab1
to
afda4f5
Compare
I've removed all one-liners and did some clean up. |
I'm just not sure whether the variable names are meaningful enough, please give suggestions if they're not good enough. |
Looks good to me. I will publish a new major release of glob-stream with this since it is a breaking change, and then a major release for vinyl-fs which depends on this new version, then gulp 4 will depend on the new vinyl-fs which will have a few other breaking changes. Thanks for the fix! Would you mind commenting on any relevant issues and link to this so I can go close them? |
Respect globs array order, fixes #25
Thanks for the roadmap clarification.
I can try, although I'm not very familiar with those repositories' issues. |
@UltCombo Search for "symlink" and that should make it easy. If you have time can you send a PR to the vinyl-fs docs to add a note about glob ordering? |
I've seen 3 or so issues about symlinks so far but they didn't appear to be related to this issue at all. By the way, shouldn't we update this repository's docs as well? |
@UltCombo Oops wrong comment wrong issue, my bad. I'm adding to the vinyl-fs docs, consider this all finished. Thanks! |
Hahah no problem, thanks for all the hard work maintaining the gulp project. 😄 |
Have you seen https://github.com/es128/anymatch ? I think this would clean up the code a lot. Not sure if it respects ordering |
Nope, I haven't seen it before. I can try it tomorrow when I have a bit more free time. Though, right now I'm not sure how anymatch would be really useful here -- as far as I can see, it does not read files from disk, so the ordering logic I've implemented would still have to be kept. If anything, anymatch would simplify the |
Oh wait, if it does respect the globs order then we might be able to replace the logic I've implemented (little sleepy, not sure honestly). I'll check tomorrow. |
@contra Doesn't look like anymatch supports negative/ordered globs. We could use multimatch instead, which is basically minimatch with added support for multiple/ordered globs. Or, we could go further and use gulp-filter which wraps multimatch in a stream. But I'm not sure -- even if we do this refactoring, it will all go to waste when fixing #24 as it will require the negative globs logic to be implemented in the globber. |
@UltCombo I don't see node-glob adding that any time soon, it's a pretty huge undertaking considering the size of that codebase |
@contra I was thinking about using a different globber. But that would probably bring too many breaking changes, and node-glob seems to be the ubiquitous solution nowadays. Anyway, what do you think about using globby? It extends node-glob adding multiple ordered globs support. Although, its negative filtering implementation is very different from glob-stream -- while glob-stream handles exclusion through minimatch, globby globs the files that would be matched by the negative globs and then excludes them from the matched files array. We'd need some perf tests. |
@UltCombo I'm totally open to changing it. I don't think there would be any breaking changes actually, our use of node-glob is really basic and abstracted by glob-stream. If you send a PR to switch globbers and all of the tests still pass I'd merge for perf increases |
Thing is, I believe many gulp users rely on node-glob features (or rather, minimatch features) on their globby just wraps node-glob, so it preserves the same set of features from node-glob. I'm just not sure about perf, as globby matches the negative globs against the file system instead of negating through minimatch as glob-stream does. That's why we need to do perf tests first. Anyway, if there's a perf problem with globby, I believe @sindresorhus can help us here. |
@UltCombo If we are going to switch globbers it should be now since we are about to push 4 which has a bunch of breaking changes anyways |
Yes, I'm just saying if we switch to globby then there is no breaking change other than the ordered globs feature which we already landed in. |
@UltCombo Want to play around with both routes and see which performs better? |
Sure, I'll prepare some test cases based on #24's use case. |
Here's a quick and dirty gist. Although glob-stream has the overhead of creating streams for the matched files (which is not really significant as there is only 1 matched file in my test), glob-stream is still over 2 times faster than globby. Not sure what to do here. I believe the correct would be to propose an improvement to globby's negative globs handling (that is, to use the same logic as glob-stream currently uses), I'm sure @sindresorhus is open to improvements in his packages. After that, we can change the globber here and remove the multiple glob handling logic (which is then handled at globby). I'm just not sure as this is a decent amount of work for technically no perf gain and no significant changes, just cleaner code here. |
Taking a second look at globby, it uses I'll raise all these issues in the globby repository, but for now it seems better to stay with glob-stream's current multiple globs handling logic. |
Alright, I've reported these issues (Negative globs handling is very slow and Multiple positive globs handling is slow). Fixing these would mean pretty much a complete rewrite of globby, letting alone switching to globby would have one more perf issue: globby only returns the matched paths after it finishes all the globing work, while glob-stream can emit files as soon as it matches them. This perf issue can't be patched in globby unless an events or streaming API is added to globby. |
I would be happy to improve |
What is globby doing that has any advantage perf-wise over what glob-stream is doing currently? |
@contra erm, none I guess. As far as I can see, both packages have the same functionality (except that glob-stream has streaming added) and my initial suggestion to use globby was just to not reinvent the wheel, but I was unaware of the globby perf issues until then. It seems like it would be possible to move 90% of glob-stream to globby thus solving its perf issues, but at that point it might as well be possible to add streaming support to globby and there would be no reason to have this package then. That is, we would just be moving things around without any real benefit. |
@UltCombo Yeah it seems like we're just moving everything to globby and then turning it into an identical module. I don't see the purpose in that, I think we need to get back to the core of the problem which was a globber that supports multiple globs and checks them while crawling, not another system that grabs all the results and then checks them. |
I'll put a $500 dollar bounty on somebody adding support for multiple globs to node-glob |
Agreed, the idea to use globby was just to clean up code, that is, to not duplicate functionality that already exists in another package. But given the perf issues, and that even after fixing the globby perf issues we'd still have the same perf as now, it doesn't really help much. Thing is, I've looking around the npm registry and other resources and can't seem to find a suitable glob library -- looks like node-glob is the ubiqutuous globber. I believe @sindresorhus has a better hold of the Node.js environment than me, but it seems a little controversial to ask him for a package that does exactly the same thing as his globby package does but better. |
I just commented here isaacs/node-glob#84 Maybe some $$$ will get somebody motivated to fix this. I'm locked up on time right now and I'm sure @isaacs is busy running his company too |
Woah, looks like our issues are about to be solved instantly. 😆 |
I too would prefer to have node-glob support this. @contra Create a bountysource and tweet it, and I'll make sure to put in some $$ too and get it RTed ;) |
@sindresorhus tweeted |
This and the file watching are like the last bottlenecks/bug generators in gulp |
@jonschlinkert and I have run into the a lot of the same issues with The 3 areas that we find important for this new project are:
Since we share a lot of the same goals, we'd like to get your input on sensible defaults (like excluding Also, @contra mentioned in #24 looking for a potential replacement to |
hey thanks @doowb. yeah, to second what Brian's saying we went down this same path and came to the realization that the problems are systemic. e.g. we can't speed up node-glob by adding another method or by changing the negation strategy in a dependent lib. it just needs to be re-written from the ground up with a different filtering strategy. since minimatch and node-glob were initially created we've learned a lot about what a globbing lib should do for node.js, so the goal is to focus on features that make sense for node.js (versus Bash 4.3 parity). e.g. you'll still use patterns the same way, 99% of end users won't know the difference, but we'll make it easier to define negation patterns, we're considering excluding node_modules by default and only globbing those files if node_modules is explicitly defined (rather than the other way around). etc. these are just example, the goal is to create a glob lib that does what we need. anyway, we'll be releasing the new globbing lib in the next week or so, it would be great to get your feedback/wishlist regarding the features that would be most important to gulp. obviously no expectations that you guys will use it or whatever |
@jonschlinkert Why not develop it out in the open? The cathedral approach is too bad :( |
yeah, sure thing, sorry my language was misleading:
meant we'll be "creating" a new lib. e.g:
we've been discussing it, but no actual code yet. I'd love to collaborate on features, everyone wins |
@jonschlinkert perhaps it would be nice to have a repository where we can open issues for such discussions? |
done https://github.com/jonschlinkert/glob-fs/issues. a wishlist would be a great starting point. I'm about to share some cookies with a 1yr old and 4yr old :) so I'll try to start a list later this week. but feel free to get something started |
Okay, I started a wishlist here: micromatch/glob-fs#1 to get some discussion started |
Respect globs array order, fixes #25
Respect globs array order, fixes #25
Fixes #25
Fixes gulpjs/gulp#837