-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
glob: more flexible than walkDirRec
#15598
Conversation
Is there any reason that this module should be in stdlib? Maybe fusion is a better choice? Also, did you see https://github.com/citycide/glob? It offers a complete cross-platform glob solution too |
I'm open but wanted to explore std as an option first given that its IMO general purpose enough for std (eg kochdocs relies on it and could be useful for tests too).
I'm quite familiar with it (I've contributed via several PR's and design discussions / bug reports). I use pkg/glob a lot, but I've written this to address several shortcomings in pkg/glob, see following points: simpler thanks to separation of concernthis PR's What it does concern itself is what's impossible to do on calling side, such as sorting and controlling iteration (BFS vs DFS). Doing it on calling side would otherwise require storing all elements in an array, which would be probihitive / inefficient. return results in natural tree orderpkg/glob doesn't return in natural tree order: import pkg/glob
for path,kind in walkGlobKinds("./**", options=defaultGlobOptions + {GlobOption.Directories}): echo path can return:
instead of:
which is what tools like unix find or python3 glob or this PR do import glob
for name in glob.glob('./**', recursive = True): print(name) BFS vs DFSDFS search is the default, but BFS is also supported, which is useful in some applications. sortingthis PR allows returning in global sorted order without requiring to fit all entries in memory. See details in PR. |
I'd be happy for my code to be subsumed by a std lib solution, but if it's decided that a package is a better place I would also be glad to add additional collaborators to https://github.com/citycide/glob (and in fact @timotheecour already has that access) for these changes to be discussed and implemented there — without my sole power over it. Edit: I could also move my |
193af4e
to
60a7628
Compare
Not acceptable for the following reasons:
|
ok so can I migrate this to a new fusion PR?
happy to change name |
@Araq This PR was closed prematurely, can we re-open? Happy to make whatever improvements are needed for stdlib inclusion. here's another recent example where this PR comes handy: #15612 this PR takes care of the nasty details in an efficient (single yield statement) and generic way (callbacks) so client code doesn't have to keep re-inventing this functionality and fall in the common traps:
# main.nim
import std/[globs,os,sugar]
for ai in glob(".", relative = true, sortCmp = (a,b) => cmp(a.path, b.path)):
if ai.kind == pcFile: echo ai.path => note that multithreading could be maybe added in future for further performance gains, the details of which should be done once in a library instead of re-implemented (potentialy poorly) in all clients. Likely one can do better than
|
If the iterator takes a callback you might as well turn it into a proc then that takes a callback. Besides, this is what we have "fusion" for. |
find
(not sure why); I tested on a dir with 500k recursive entries (filesystem was hot, ie after a coldstart)links
walkDirRecFilter
recursive glob with follow filter (has full implementation) RFCs#261limitation