Hi,
I was looking for a Python implementation to match files with the rules defined in .gitignore files and this project is great!
My use case is to synchronize directories across a network and most of the control logic (filter, compare, update) is at the inode level to allow me to maximize the number of skipped elements (to not explore excluded directories for example).
I would like to update my current filter logic to support git patterns: given a list of patterns, is my file path matched or not? The issue is that pathspec currently seems to be heavily oriented around processing lists of paths. What if I have a single file?
Here is what my current implementation boils down to:
    import pathspec

    spec = pathspec.PathSpec.from_lines(pathspec.GitIgnorePattern, patterns)

    def match_file(file_path):
        return len(list(spec.match_files([file_path]))) > 0  # This should not be so complicated

    is_ignored = match_file(u'testfile.py')
As you can see, it's pretty cumbersome: I have to create a collection with a single item, run the matcher and then extract the result.
Ideally, I would imagine that PathSpec exposes a match_file function returning a boolean, and match_files (or filter_files, since it's currently acting as a filter?) would just reuse it:
    class PathSpec(object):
        # ...

        def match_file(self, file, separators=None):  # Core logic
            norm, path = util.normalize_file(file, separators=separators)  # Single file version
            is_matched = util.match_file(self.patterns, norm)  # Single file version
            return is_matched  # bool

        def match_files(self, files, separators=None):  # Quality of life function: it just replaces a one line generator
            return (file for file in files if self.match_file(file, separators))
Basically, it boils down to the fact that the library does not expose single-item functions that would let me iterate over my files as I want, but instead hides a loop inside every function.
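To illustrate the kind of traversal described above, here is a minimal sketch of a directory walk that consults a single-path predicate and never descends into excluded directories. The names `is_excluded` and `walk_included` are hypothetical, and `fnmatch` on the base name is only a stand-in for real gitignore matching; it does not implement wildmatch semantics.

```python
import fnmatch
import os


def is_excluded(rel_path, patterns):
    """Hypothetical single-path predicate.

    fnmatch on the base name stands in for gitignore-style matching.
    """
    name = os.path.basename(rel_path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def walk_included(root, patterns):
    """Yield relative paths of included files, skipping excluded subtrees."""
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune in place so os.walk does not explore excluded directories.
        dirnames[:] = [
            d for d in dirnames
            if not is_excluded(os.path.join(rel_dir, d), patterns)
        ]
        for filename in filenames:
            rel_file = os.path.normpath(os.path.join(rel_dir, filename))
            if not is_excluded(rel_file, patterns):
                yield rel_file
```

With a boolean predicate that works on one path at a time, the caller keeps control of the loop and can decide per directory whether to descend at all, which a list-in, list-out API makes awkward.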
What do you think about adding better support for single file matching? I am aware that, due to the current architecture of the library, it would require some refactoring, but I believe that it would be for the best. Could you implement it, or should I do it and send a PR? (Since it's a big change, I'd rather wait for your feedback.)
Side note: the real name of the gitignore matcher is wildmatch. How about adding this as an alias name when registering the pattern? Your module deserves to be better referenced (I had some trouble finding it even though I knew what I was looking for).
@demurgos That is quite cumbersome to match files one at a time. In the next release, I'll add PathSpec.match_file. Thanks for pointing out the proper name for the pattern matching git uses for ".gitignore". I tried searching for its name when I started this project, but I came up with nothing.
PathSpec.match_file has been implemented, and GitIgnorePattern has been renamed to GitWildMatchPattern. The GitIgnorePattern is still available for backward compatibility.
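For readers landing here, a minimal usage sketch of the single-file check follows. It assumes an installed pathspec release that includes this change and that the renamed pattern is registered under the name 'gitwildmatch'; the pattern list and file names are made up for illustration.

```python
import pathspec

# Illustrative patterns only; in practice these would come from a .gitignore file.
patterns = ["*.py"]

# Assumption: 'gitwildmatch' is the registered name of GitWildMatchPattern.
spec = pathspec.PathSpec.from_lines("gitwildmatch", patterns)

# match_file takes a single path and returns a boolean.
is_ignored = spec.match_file("testfile.py")
is_kept = not spec.match_file("README.md")
```

This replaces the earlier workaround of wrapping the path in a one-item list and checking whether match_files yields anything.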