Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect globs array order, fixes #25 #27

Merged
merged 1 commit into from
Dec 28, 2014
Merged

Conversation

UltCombo
Copy link
Contributor

Fixes #25
Fixes gulpjs/gulp#837

@coveralls
Copy link

Coverage Status

Coverage increased (+0.39%) when pulling ea2f8e2 on UltCombo:globs-order into 472b98e on wearefractal:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.39%) when pulling a8a7011 on UltCombo:globs-order into 472b98e on wearefractal:master.


if (positives.length === 0) throw new Error("Missing positive glob");

// only one positive glob no need to aggregate
if (positives.length === 1) return gs.createStream(positives[0], negatives, opt);
if (positives.length === 1) {
return gs.createStream(positives[0].glob, negatives.filter(indexGreaterThan(positives[0].index)).map(toGlob), opt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no crazy one-liners

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this was a small perf optimization (checking for length 1) that can probably be removed now that we are ordering

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the one-liner: I've put it in a single line for consistency with this line below. Should I just add some line breaks in the middle of the expression or create unnecessary temp vars?

About length === 1: AFAICS the same perf benefit still holds after this patch. That is, although the globs are now ordered, the file emission order inside each individual stream remains unordered. I don't quite see why this optimization should be removed now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@UltCombo I switched from coffee -> JS then wrote this lib so I was still fond of one-liners. Now I understand how much they mess up code readability by not having meaningful variable names for important stuff. If you could remove any crazy one liners and use var names to update it to the new (and better) style that I've adopted across these projects that would be cool 🍻

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, wouldn't you have a style guide around? It is a bit hard to determine the proper style without one.
I've pushed a new commit, PTAL.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.48%) when pulling bf90ab1 on UltCombo:globs-order into 472b98e on wearefractal:master.

@UltCombo
Copy link
Contributor Author

I've removed all one-liners and did some clean up.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.43%) when pulling afda4f5 on UltCombo:globs-order into 472b98e on wearefractal:master.

@UltCombo
Copy link
Contributor Author

I'm just not sure whether the variable names are meaningful enough, please give suggestions if they're not good enough.

@yocontra
Copy link
Member

Looks good to me. I will publish a new major release of glob-stream with this since it is a breaking change, and then a major release for vinyl-fs which depends on this new version, then gulp 4 will depend on the new vinyl-fs which will have a few other breaking changes. Thanks for the fix! Would you mind commenting on any relevant issues and link to this so I can go close them?

yocontra pushed a commit that referenced this pull request Dec 28, 2014
Respect globs array order, fixes #25
@yocontra yocontra merged commit 18a1a25 into gulpjs:master Dec 28, 2014
@UltCombo UltCombo deleted the globs-order branch December 28, 2014 00:14
@UltCombo
Copy link
Contributor Author

Thanks for the roadmap clarification. :)

Would you mind commenting on any relevant issues and link to this so I can go close them?

I can try, although I'm not very familiar with those repositories' issues.

@yocontra
Copy link
Member

@UltCombo Search for "symlink" and that should make it easy.

If you have time can you send a PR to the vinyl-fs docs to add a note about glob ordering?

@UltCombo
Copy link
Contributor Author

I've seen 3 or so issues about symlinks so far but they didn't appear to be related to this issue at all.

By the way, shouldn't we update this repository's docs as well?

@yocontra
Copy link
Member

@UltCombo Oops wrong comment wrong issue, my bad. I'm adding to the vinyl-fs docs, consider this all finished. Thanks!

@UltCombo
Copy link
Contributor Author

Hahah no problem, thanks for all the hard work maintaining the gulp project. 😄

@yocontra
Copy link
Member

Have you seen https://github.com/es128/anymatch ? I think this would clean up the code a lot. Not sure if it respects ordering

@UltCombo
Copy link
Contributor Author

Nope, I haven't seen it before. I can try it tomorrow when I have a bit more free time.

Though, right now I'm not sure how anymatch would be really useful here -- as far as I can see, it does not read files from disk, so the ordering logic I've implemented would still have to be kept. If anything, anymatch would simplify the isMatch and filterNegatives functions.

@UltCombo
Copy link
Contributor Author

Oh wait, if it does respect the globs order then we might be able to replace the logic I've implemented (little sleepy, not sure honestly). I'll check tomorrow. :D

@UltCombo
Copy link
Contributor Author

@contra Doesn't look like anymatch supports negative/ordered globs. We could use multimatch instead, which is basically minimatch with added support for multiple/ordered globs. Or, we could go further and use gulp-filter which wraps multimatch in a stream.

But I'm not sure -- even if we do this refactoring, it will all go to waste when fixing #24 as it will require the negative globs logic to be implemented in the globber.

@yocontra
Copy link
Member

@UltCombo I don't see node-glob adding that any time soon, it's a pretty huge undertaking considering the size of that codebase

@UltCombo
Copy link
Contributor Author

@contra I was thinking about using a different globber. But that would probably bring too many breaking changes, and node-glob seems to be the ubiquitous solution nowadays.

Anyway, what do you think about using globby? It extends node-glob adding multiple ordered globs support. Although, its negative filtering implementation is very different from glob-stream -- while glob-stream handles exclusion through minimatch, globby globs the files that would be matched by the negative globs and then excludes them from the matched files array. We'd need some perf tests.

@yocontra
Copy link
Member

@UltCombo I'm totally open to changing it. I don't think there would be any breaking changes actually, our use of node-glob is really basic and abstracted by glob-stream. If you send a PR to switch globbers and all of the tests still pass I'd merge for perf increases

@UltCombo
Copy link
Contributor Author

Thing is, I believe many gulp users rely on node-glob features (or rather, minimatch features) on their gulp.src() globs, that's why I'd consider changing the globber a breaking change.

globby just wraps node-glob, so it preserves the same set of features from node-glob. I'm just not sure about perf, as globby matches the negative globs against the file system instead of negating through minimatch as glob-stream does. That's why we need to do perf tests first. Anyway, if there's a perf problem with globby, I believe @sindresorhus can help us here. ;)

@yocontra
Copy link
Member

@UltCombo If we are going to switch globbers it should be now since we are about to push 4 which has a bunch of breaking changes anyways

@UltCombo
Copy link
Contributor Author

Yes, I'm just saying if we switch to globby then there is no breaking change other than the ordered globs feature which we already landed in.

@yocontra
Copy link
Member

@UltCombo Want to play around with both routes and see which performs better?

@UltCombo
Copy link
Contributor Author

Sure, I'll prepare some test cases based on #24's use case. ;)

@UltCombo
Copy link
Contributor Author

Here's a quick and dirty gist. Although glob-stream has the overhead of creating streams for the matched files (which is not really significant as there is only 1 matched file in my test), glob-stream is still over 2 times faster than globby.

Not sure what to do here. I believe the correct would be to propose an improvement to globby's negative globs handling (that is, to use the same logic as glob-stream currently uses), I'm sure @sindresorhus is open to improvements in his packages. After that, we can change the globber here and remove the multiple glob handling logic (which is then handled at globby).

I'm just not sure as this is a decent amount of work for technically no perf gain and no significant changes, just cleaner code here.

@UltCombo
Copy link
Contributor Author

Taking a second look at globby, it uses async.reduce which runs in series, that is another perf loss regarding multiple positive globs.

I'll raise all these issues in the globby repository, but for now it seems better to stay with glob-stream's current multiple globs handling logic.

@UltCombo
Copy link
Contributor Author

Alright, I've reported these issues (Negative globs handling is very slow and Multiple positive globs handling is slow). Fixing these would mean pretty much a complete rewrite of globby, letting alone switching to globby would have one more perf issue: globby only returns the matched paths after it finishes all the globing work, while glob-stream can emit files as soon as it matches them. This perf issue can't be patched in globby unless an events or streaming API is added to globby.

@sindresorhus sindresorhus mentioned this pull request Dec 30, 2014
3 tasks
@sindresorhus
Copy link
Contributor

I would be happy to improve globby to make it workable for gulp, but I'm short on time atm as I'm traveling, so probably won't happen in the short run unless I get some help.

sindresorhus/globby#5

@yocontra
Copy link
Member

What is globby doing that has any advantage perf-wise over what glob-stream is doing currently?

@UltCombo
Copy link
Contributor Author

@contra erm, none I guess. As far as I can see, both packages have the same functionality (except that glob-stream has streaming added) and my initial suggestion to use globby was just to not reinvent the wheel, but I was unaware of the globby perf issues until then.

It seems like it would be possible to move 90% of glob-stream to globby thus solving its perf issues, but at that point it might as well be possible to add streaming support to globby and there would be no reason to have this package then. That is, we would just be moving things around without any real benefit.

@yocontra
Copy link
Member

@UltCombo Yeah it seems like we're just moving everything to globby and then turning it into an identical module. I don't see the purpose in that, I think we need to get back to the core of the problem which was a globber that supports multiple globs and checks them while crawling, not another system that grabs all the results and then checks them.

@yocontra
Copy link
Member

I'll put a $500 dollar bounty on somebody adding support for multiple globs to node-glob

@UltCombo
Copy link
Contributor Author

Agreed, the idea to use globby was just to clean up code, that is, to not duplicate functionality that already exists in another package. But given the perf issues, and that even after fixing the globby perf issues we'd still have the same perf as now, it doesn't really help much.

Thing is, I've looking around the npm registry and other resources and can't seem to find a suitable glob library -- looks like node-glob is the ubiqutuous globber.

I believe @sindresorhus has a better hold of the Node.js environment than me, but it seems a little controversial to ask him for a package that does exactly the same thing as his globby package does but better.

@yocontra
Copy link
Member

I just commented here isaacs/node-glob#84

Maybe some $$$ will get somebody motivated to fix this. I'm locked up on time right now and I'm sure @isaacs is busy running his company too

@UltCombo
Copy link
Contributor Author

I'll put a $500 dollar bounty on somebody adding support for multiple globs to node-glob

Woah, looks like our issues are about to be solved instantly. 😆

@sindresorhus
Copy link
Contributor

I too would prefer to have node-glob support this.

@contra Create a bountysource and tweet it, and I'll make sure to put in some $$ too and get it RTed ;)

@yocontra
Copy link
Member

@sindresorhus tweeted

@yocontra
Copy link
Member

This and the file watching are like the last bottlenecks/bug generators in gulp

@doowb
Copy link
Member

doowb commented Dec 31, 2014

@jonschlinkert and I have run into the a lot of the same issues with node-glob on our own projects and started working on a replacement. @jonschlinkert already created an awesome project called micromatch that solves some of these issues and is 10x-20x faster than minimatch

The 3 areas that we find important for this new project are:

  • faster matching and handling of negation patterns - micromatch
  • more sensible defaults for node.js vs bash - micromatch + new project
  • most importantly, the strategy node-glob uses when hitting the file system - new project

Since we share a lot of the same goals, we'd like to get your input on sensible defaults (like excluding node_modules unless specified in a pattern) and feedback when the new project is published.

Also, @contra mentioned in #24 looking for a potential replacement to node-glob and I'm confident this new project will suit your needs.

@jonschlinkert
Copy link
Contributor

hey thanks @doowb. yeah, to second what Brian's saying we went down this same path and came to the realization that the problems are systemic. e.g. we can't speed up node-glob by adding another method or by changing the negation strategy in a dependent lib. it just needs to be re-written from the ground up with a different filtering strategy. since minimatch and node-glob were initially created we've learned a lot about what a globbing lib should do for node.js, so the goal is to focus on features that make sense for node.js (versus Bash 4.3 parity). e.g. you'll still use patterns the same way, 99% of end users won't know the difference, but we'll make it easier to define negation patterns, we're considering excluding node_modules by default and only globbing those files if node_modules is explicitly defined (rather than the other way around). etc. these are just example, the goal is to create a glob lib that does what we need.

anyway, we'll be releasing the new globbing lib in the next week or so, it would be great to get your feedback/wishlist regarding the features that would be most important to gulp. obviously no expectations that you guys will use it or whatever

@yocontra
Copy link
Member

@jonschlinkert Why not develop it out in the open? The cathedral approach is too bad :(

@jonschlinkert
Copy link
Contributor

yeah, sure thing, sorry my language was misleading:

we'll be releasing the new globbing lib in the next week

meant we'll be "creating" a new lib. e.g:

it would be great to get your feedback/wishlist regarding the features that would be most important to gulp

we've been discussing it, but no actual code yet. I'd love to collaborate on features, everyone wins

@UltCombo
Copy link
Contributor Author

@jonschlinkert perhaps it would be nice to have a repository where we can open issues for such discussions?

@jonschlinkert
Copy link
Contributor

done https://github.com/jonschlinkert/glob-fs/issues. a wishlist would be a great starting point. I'm about to share some cookies with a 1yr old and 4yr old :) so I'll try to start a list later this week. but feel free to get something started

@jonschlinkert
Copy link
Contributor

Okay, I started a wishlist here: micromatch/glob-fs#1 to get some discussion started

phated pushed a commit that referenced this pull request Feb 16, 2017
Respect globs array order, fixes #25
phated pushed a commit that referenced this pull request Feb 16, 2017
Respect globs array order, fixes #25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants