Are results non-deterministic? #131

Closed
ozum opened this issue Jul 15, 2019 · 5 comments


ozum commented Jul 15, 2019

Hi,

I looked through the docs and issues but could not find an answer. The order of the files returned by globby changes from execution to execution, even with the same directory structure and the same glob patterns.

In one of my projects, some tests depend on the order of the returned files, and those tests fail intermittently:

About 90% of the time (an observation, not a measurement):

a/a
b/b1
b/b2

vs.

About 10% of the time:

b/b1
b/b2
a/a

Is this expected behavior?

Thanks,

sindresorhus (Owner) commented

Yeah, order is not guaranteed, as fast-glob, the underlying package used for globbing, doesn't guarantee the order:

> meanwhile results are returned in arbitrary order. Quick, simple, effective. - https://github.com/mrmlnc/fast-glob

We could maybe sort the results, but it's not clear what downsides that might have.

@mrmlnc Is there a reason you're not just sorting the resulting array before returning it?

mrmlnc (Contributor) commented Jul 22, 2019

> Is there a reason you're not just sorting the resulting array before returning it?

The reason for the non-deterministic order is parallelism when working with the file system. The fast-glob package uses a breadth-first search algorithm (we try to read all the subdirectories of a directory in parallel), so entries arrive in whatever order the reads complete. The node-glob package uses a depth-first search algorithm.
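To make the effect of parallel reads concrete, here is a toy model. It is not fast-glob's code: `SimulatedDir`, `latencyMs`, and `completionOrder` are hypothetical names, and the latencies are invented. Each directory read is assigned a latency, and entries are emitted in the order the simulated reads finish rather than in the order the directories were listed.

```typescript
// Toy model only: simulates parallel directory reads that finish at different times.
interface SimulatedDir {
  name: string;
  entries: string[];
  latencyMs: number; // hypothetical time the read takes to complete
}

// Returns entries in the order the simulated reads complete (lowest latency first).
function completionOrder(dirs: readonly SimulatedDir[]): string[] {
  const byCompletion = [...dirs].sort((x, y) => x.latencyMs - y.latencyMs);
  const result: string[] = [];
  for (const dir of byCompletion) {
    for (const entry of dir.entries) {
      result.push(`${dir.name}/${entry}`);
    }
  }
  return result;
}

// Same directory structure, different timing between two runs.
const usualRun = completionOrder([
  { name: "a", entries: ["a"], latencyMs: 1 },
  { name: "b", entries: ["b1", "b2"], latencyMs: 2 },
]);
const rareRun = completionOrder([
  { name: "a", entries: ["a"], latencyMs: 3 },
  { name: "b", entries: ["b1", "b2"], latencyMs: 2 },
]);

console.log(usualRun); // ["a/a", "b/b1", "b/b2"]
console.log(rareRun); // ["b/b1", "b/b2", "a/a"]
```

The two runs contain the same paths; only the timing of the reads differs, which matches the two orderings reported at the top of this issue.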

I have two reasons for not sorting the results:

  1. Sorting is not free, and the goal of my package is to return results as quickly as possible.
  2. fast-glob has a Stream API, and we can't control the order there.

If you need a stable order, you can simply sort the results yourself.
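A minimal sketch of sorting on the caller's side. `sortPaths` is a hypothetical helper, not part of globby or fast-glob, and the arrays stand in for the results of two runs. It compares strings by code unit rather than with `localeCompare`, on the assumption that a locale-independent order is what tests need.

```typescript
// Hypothetical helper: returns a sorted copy and leaves the input untouched.
function sortPaths(paths: readonly string[]): string[] {
  // Plain code-unit comparison, so the order does not depend on the machine's locale.
  return [...paths].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

// Stand-ins for the results of two runs over the same directory structure.
const runA = ["a/a", "b/b1", "b/b2"];
const runB = ["b/b1", "b/b2", "a/a"];

console.log(sortPaths(runA)); // ["a/a", "b/b1", "b/b2"]
console.log(sortPaths(runB)); // ["a/a", "b/b1", "b/b2"]
```

In a test, the same helper could wrap the array that the glob call resolves to before comparing it with the expected list.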

ozum (Author) commented Jul 22, 2019

Thanks for the clarification. @sindresorhus, it may be helpful (for others) to state in the README that the order of results is non-deterministic. I assumed (my mistake) that results were ordered, until some of my tests started failing recently.

As you pointed out, fast-glob documents this, but people usually don't read the docs of their dependencies' dependencies.

By the way, many thanks to @mrmlnc for fast-glob and to @sindresorhus for globby.


thorn0 commented Feb 23, 2020

> The reason for the non-deterministic order is parallelism when working with the file system.

@mrmlnc Does that apply only to the async API? Or does the return value of fastGlob.sync(...) have non-deterministic order too?

mrmlnc (Contributor) commented Feb 23, 2020

The synchronous API must return results in a deterministic order. I know of only one case where order can be broken:

JFYI: https://github.com/nodelib/nodelib/blob/8e0d8b889e282690bb5e2bfe2442f2f099438e22/packages/fs/fs.walk/src/readers/sync.ts#L7
