-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
experimental traits to more cleanly extend to key-value stores #159
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #159 +/- ##
==========================================
- Coverage 91.91% 91.61% -0.30%
==========================================
Files 12 12
Lines 1224 1228 +4
==========================================
Hits 1125 1125
- Misses 99 103 +4
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like the idea as a way to create more clear cutoffs for unsupported functionality. I do worry that this isn't that much of a change from just moving more of the default behaviour to just work on SystemPath.
Base.isdir(fp::AbstractPath) = isdir(mode(fp)) | ||
|
||
Base.isdir(fp::AbstractPath) = isdir(directoriestype(fp), fp) | ||
Base.isdir(::DirectoriesExplicit, fp::AbstractPath) = isdir(mode(fp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the assumption is that if I have:
directoriestype(::Type{MyPath}) = DirectoriesImplicit()
Then I also need to implement Base.isdir(::DirectoriesImplicit, fp::MyPath)
? So the benefit is just when you throw the MethodError
... or are we expecting folks to extend DirectoriesType
with some shared logic that could be shared with multiple key-value path types?
Base.isfile(fp::AbstractPath) = isfile(mode(fp)) | ||
|
||
#TODO: most of these clearly only make sense on local FS, should they have traits? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a trait for this is a very good idea if the primary goal is to more cleanly fail for unsupported functionality. This could still get weird in the sense that S3Paths do still have permissions, but they work differently.
I'm growing a little skeptical of my own idea. On one hand, I definitely stand by the idea that this package assumes too much of the underlying file system implementations, but on the other, I think we should be very careful not to completely overhaul this package based on S3 alone. Yes, in principle most of what would be done for S3 could be extended to other key-value stores, but S3 is the only use case that I feel certain will be used. I think it would be productive if we could collectively come up with some more concrete examples of how this package might be used in the future. If those are too hard to come by, maybe generalizing this is a bad idea. |
I think a few obvious examples include:
One thing that keeps bugging me is that there's a weird overlap between Dictionaries.jl, FilePathsBase.jl, FileTrees.jl and AxisSets.jl in which we often want to just describe algorithms for search/navigating prefixes against arbitrary objects (e.g., local files, remote objects, in-memory julia types). I wonder if some of the S3 issues would be better served with a kind of index API which must be implemented independent of authorization or read/write or permissions logic? |
Ok, that's definitely more examples than I had come up with. Also @omus suggested something like this in the other thread, I think not realizing I had done this. My interest in this is renewed, I will work on it some more some time this week. |
I've revisited this today. On further reflection, it does seem that extending the interface this way is a good idea especially for AWSS3, but it is slightly more complicated than I had hoped. The main breaking change I propose to the interface as of now is that |
So the issue is that |
I don't know yet... regardless I don't think there's going to be any avoiding some breaking changes (though I think they can be minor). |
This is a (very small) first step of an experimental implementation of traits to attempt to address some of the problems with this issue.
Currently, FilePathsBase makes quite a lot of assumptions about the semantics of paths, I think it's fair to say that it assumes that paths are either POSIX compatible or windows. This is arguably unnecessarily restrictive to use cases which mirror these very closely.
In this draft I had created the traits
DirectoriesType
andPermissionsType
. The former is to indicate whether directories are their own objects in a file system or whether they are merely implied by the path structure, the latter being the case for key-value stores such as S3.PermissionsType
is whether file or directory permissions are supported by the file system.An objection I anticipate to this proposal is that key-value stores are not file systems and it is a mistake to treat them like one. This is a wise objection, however there seem to be a number of cases (notably AWS S3, but also other AWS key-value stores) in which the key-value store is so commonly treated as a file system that using methods such as those provided by FilePathsBase is unavoidable. Indeed, the idea that S3 is a file system seems to be deeply ingrained in much of the "big data" ecosystem (horrific though that ecosystem might be). Therefore, I think it is at least worth asking whether FilePathsBase can be made general enough to handle those cases more seamlessly.
The "demo" I have in this draft defines methods in terms of intermediate methods which check the interface, so that attempts to use unimplemented methods will result in method errors. The real work to be done is re-writing some of the fall-back methods so that they make sense with the provided traits (I haven't done even a single example of this yet).
Obviously, I have done very little work here, but I wanted to gauge interest and receptiveness to this idea before I put any real time into this.