This repository has been archived by the owner on Jun 19, 2023. It is now read-only.

[Draft] File system interface #30

Closed
wants to merge 3 commits into from

Conversation


@djdv djdv commented May 8, 2019

Draft notes: I'll update this as things become clearer, but please critique heavily in the meantime.

I propose that we expose a filesystem interface that allows developers to treat some object as an index composed of generic nodes, ultimately allowing access to content living in IPFS APIs in the same unified way that the os pkg exposes access to nodes on disks and other media.

In essence, it shouldn't matter which API we need to interact with, just that some operation be performed on it.
In the same way you can call os.OpenFile on any valid OS path, regardless of whether it resides on a disk, NFS, etc.,
you should be able to call ipfs.OpenFile with any valid IPFS path, regardless of whether it resides in IPFS, IPNS, MFS, UnixFS, et al.

i.e.

	f, err := ipfs.OpenFile("/ipfs/Qm.../file.ext", flags, perm)
	if err != nil {
		//...
	}
	defer f.Close()
	x, y := f.Read(someByteSlice)

Providing this interface lets us implement a standard filesystem pkg that people may wrap at the Go API level to supply their own logic around file system operations, without having to implement operation bindings for every API we have, while conforming to a single standard.
Our own use cases include using this to implement the FUSE interface and, in a meta fashion, exposing IPFS as a "file system" to Go, in a way similar to MFS.

Included in the patchset are
interface.go
Which defines the interface for a filesystem and standard IO types.
index.go
Which is a partial implementation of this, showing off a slash(/) delimited, namespace oriented, filing system.
client.go
Which shows how 3rd party developers would use these interfaces, and construct their own.

Not included are the parser and node implementations. The interfaces around these may change,
but they are very simple bindings to our APIs. You can imagine an IPFS namespace parser requiring a handle to the coreapi, which is used during FS+IO operations. Others are similar.

TODO: ...

Additional context:
You can see some of the lineage of this here: ipfs/kubo#5003 (comment)

@magik6k magik6k self-requested a review May 8, 2019 21:11
@hannahhoward
Contributor

hannahhoward commented May 10, 2019

Having reviewed this, and the corresponding context, my first thought is that naming is key here to avoid confusion, and made complicated by the fact that we already have a bunch of concepts in IPFS with the term "file system" in them. So that deserves some thought.

If I understand correctly, in the FUSE case the hierarchy looks something like this, going from Blocks to FUSE:

  • Blocks
  • UnixFs / MFS / IPFS / UnixFSv2
  • Implementations of FsNode for all of the above (is this implemented on CoreAPI interfaces to these? Or on the packages themselves directly?)
  • Implementations of a ParseFn for all the above
  • A default implementation of Index + Filesystem
  • A FUSE FileSystemInterface that uses said Index + Filesystem
  • FUSE

Am I getting it right?

Here are some thoughts:

  • Before we add all these APIs to the already large core interface surface -- have we identified the concrete cases where a user of IPFS would specifically like to interact with IPFS this way, other than in backing FUSE?
  • Core Interface I believe indicates we're going to service this over multiple protocols -- at the command line, over HTTP, via gRPC eventually. (and also, I can't tell if go-ipfs-core = IPFS core API in general = needs to land in javascript as well?). Your discussions discuss specifically interacting at the Go API level. Maybe all this should just be a package of its own? I.e. go-ipfs-file-system -- that someone can decide to employ if it fits their specific needs?
  • I would consider introducing the word "Generic" in front of these interface names, or some other term to indicate this is a meta file system interface that operates mostly on top of UnixFs/MFS/etc.
  • Right now, I can provide custom implementations for:
    • an FsNode
    • a ParseFn
    • an Index
    • a Filesystem

My first thought is the Index should be an explicit data member of FileSystem, not an embedded interface (i.e. composition over inheritance, sorta?). I can see lots of reasons to customize the index, not a whole lot of reasons to write a custom implementation of OpenFile. In fact, I think allowing a custom implementation of FileSystem could open us up to pretty bizarre behavior potentially (i.e. do we really want anyone implementing OpenFile by calling YieldTo(ctx, unix.TDirectory) -- eek!)

On a broader level, my meta comment is about how generic we need to get. I can see that you have to implement FsNode for UnixFs / MFS / IPFS / UnixFSv2 to support FUSE (I can't completely tell if you have to implement FileSystem/Index purely to support FUSE). And then it makes sense that they are useful generic interfaces, and people certainly might want to make more FsNode types specifically for the FUSE case. So it certainly makes sense to expose them, and provide a mechanism for implementing more FsNode types in the case of FUSE (maybe this is where the need for Index/FileSystem comes in?). The question is whether the abstraction should keep going, and escalate to providing these as generics in go-core-interface, for implementing generic indexes and generic filesystems in support of generic needs.

I'm not really sure what the right answers are here without more context, just want to surface these questions.

@hannahhoward
Contributor

Sorry one more clarification -- all these are just questions, not objections to this concept, and some may be based on misconceptions I have about interface-go-ipfs-core or other broader concepts in the IPFS system. It feels a bit above my pay grade to try to figure this stuff out -- so I'm mainly trying to offer feedback as best I can.

Member

@magik6k magik6k left a comment


(Some random comments)

I'd limit context use to cancelling requests. For anything that requires more complex closing logic (like flushing buffers, which needs to keep blockstore open), I'd use https://github.com/jbenet/goprocess

// we depend on data from the coreapi to initialize our API nodes
// so fetch it or something and store it on the FS
daemon := fallbackApi()
ctx := deriveCtx(daemon.ctx) // if the daemon cancels, so should we
Member


When you construct the fallback API, you'll control the context. https://github.com/ipfs/interface-go-ipfs-core/issues/32

{"/ipns/", nameAPIParser},
{filesRootPrefix, filesAPIParser},
} {
closer, err := pkgRoot.Register(pair.string, pair.ParseFn)
Member


We may want to adopt some unix terminology - pkgRoot.Mount?

func NewDefaultFileSystem(parentCtx context.Context) (FileSystem, error) {
// something like this
fsCtx := deriveFrom(parentCtx)
// go onCancel(fsCtx) { callClosers() } ()
Member


I'd use goprocess instead of contexts since it allows for much nicer closing logic (and we already use it for quite a few things)

corepath.Path
InitMetadata(context.Context) (FileMetadata, error)
Metadata() FileMetadata
//RWLocker Maybe?
Member


Note that eventually this may need to work over the HTTP API.

@djdv djdv closed this May 11, 2019
@djdv djdv reopened this May 11, 2019
@djdv
Author

djdv commented May 11, 2019

Premature post because hotkeys, ignore that^

@hannahhoward
Much appreciated!

naming is key here

Agreed. Team members can attest to my VERY GOOD pre-draft explanations.
"well to index the index results in an index which is an index compatible with either index"

Am I getting it right?

It seems so! Generally trying to construct a file system interface from the technologies we have, unified under a common overarching interface.

Before we add all these APIs to the already large core interface surface -- have we identified the concrete cases where a user of IPFS would specifically like to interact with IPFS this way, other than in backing FUSE?

In general, it shares the appeal of the CoreAPI for Go developers: exposing multiple IPFS operations through one unified interface, wrapping the CoreAPI and others internally to expose IPFS as a filesystem usable by the Go runtime.

This makes snippets like the following legal, without developers needing to know specific/new conventions for specific/new APIs.

func client() {
	f, err := fs.OpenFile("/ipfs/Qm.../file.ext", flags, perm)
	if err != nil {
		//...
	}
	defer f.Close()
	x, y := f.Read(someByteSlice)
}

at the command line, over HTTP, via gRPC eventually
needs to land in javascript as well?
Your discussions discuss specifically interacting at the Go API level. Maybe all this should just be a package of its own?

A self-contained package was certainly the original intent; however, it seems useful to expose this interface (only) at the language level if we can.
For an analogy, we can say that this interface is to Go, what the POSIX file system specification is to C.
We define how the fs works, a Go programmer picks an implementation, and then uses it in a way that seems native to their programming language.
In theory, the JS team would have to create something similar, exposing multiple IPFS APIs as a file system that fits naturally with their language, but looks more or less the same. More thought needs to be put into cross-project concerns though.

I would consider introducing the word "Generic" in front of these interface names, or some other term to indicate this is a meta file system interface that operates mostly on top of UnixFs/MFS/etc.

Agreed. I'm not sure what to call them yet, but these names need work. It's unfortunately really easy to fall into stuttering patterns: fs.FsFileNodeFile, "My file system implements the interplanetary file system filesystem interface interface", etc.
This will hopefully improve as the rest of the terms are refined and the systems are explored further.

--
In regards to the tail of the post, this is something I need to work on demonstrating somehow.
I'll be posting offline context that may relate, but I'm going to address this again later. But for brief context, the general idea is that we want an abstract filesystem root that is utilized by Go programs, which can also mean go-ipfs itself.
Putting this into perspective, it would be possible to have ls /mount/ipfs-mount and ipfs ls / return the same results, while also having that conceptual / be alive/dynamic. If you give a Go program access to the index, it could register /something new, and as long as it conforms to a directory, ipfs ls would be able to list it, even if we didn't define it and didn't register it ourselves.
This may be more contrived than practically useful though.

@djdv
Author

djdv commented May 11, 2019

Offline feedback note (unrelated to above):
In regard to why we are using context values the way we are.

This is not entirely thought out yet. There may be better ways of handling it, but I'll demonstrate how they're used in concept, and why we need a way of passing arbitrary data from the request to the node.

If you wanted to implement a filesystem that supported "symlink creation" as a concept, your package would look something like this:

package myFs
import fs "core/filesystem"
var root = overlay(fs.BaseRoot) // derive from an Fs that already has standard conventions built-in

func CreateLink(ctx context.Context, name, target string) error {
	/* Here, the FS package is (only) responsible for defining its (own) operation specifications

	We have decided that for `myFs`, this operation succeeds if:
	- `name` does not exist
	- `target` exists
	- The IO operation itself actually succeeds
	otherwise we return a,b,c for scenarios x, y, z

	The implementation follows:
	*/

	// To start, we want to get an abstract type "FsNode"
	// which points to a filesystem location that may or may not exist
	fsNode, err := root.Lookup(name)
	switch err {
	case nil:
		// the name already exists, so the link cannot be created
		log(...); return fs.ErrExist
	case fs.ErrNoExist:
		// expected: the name is free, carry on
	default:
		log(err); return err
	}

	// check the target
	targetNode, err := root.Lookup(target)
	if err != nil {
		log(err); return wrap(err)
	}

	/* We have now determined that this is a valid request,
	and would like to relay it to the API the name belongs to

	If the name resides in a namespace where symlinks are stored as a unixfs type
	we'd normally have to make unixfs requests here
	However, `Lookup` returned a node to us
	that already implements an interface between us and the node

	All that's needed from us, is the type+request specific values required by the node
	(Which ultimately make it to unixfs (api-level), and then get turned into go-merkledag ProtoNodes (format-level))
	*/

	callCtx := context.WithValue(ctx, fs.TargetKey, target)
	err = fsNode.Create(callCtx, name, unixfs.TSymlink)
	if err != nil {
		return x
	}
	return y
}

On the consumer side:

import "myFs"

myFs.CreateLink("/namespace1/link", "/somewhere/target")

Ultimately, we just need to be able to define and enforce filesystem behaviour by processing client requests and orchestrating abstract nodes with request values that we have access to and that the node requires but may not have.

Something like this:
filesystem.Create($NAME, $TYPE) => node.Create($REQCTX, $TYPE) => someApi.SpecificFunction(name, apiSpecificValue1FromReq, apiSpecificValue2FromReq)

This requires us to interface with the node in a way that crosses api and format boundaries.
At the moment, this is simply passing in expected values through the context, with keys defined by the node implementation package. But could likely be handled another way which may be preferable.

Implementing new nodes and adding them in your own filesystem would look like this:

package myNode

const P2Ppipe FsType = 123456
type PeerKey struct{}

type Pipe struct {...}
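func (p *Pipe) Create(ctx context.Context, name string, typ FsType) error {
	//...
	if p.handle != nil {
		return errors.New("pipe already connected") // placeholder error
	}

	friend := ctx.Value(PeerKey{}) // request value supplied by the caller
	var err error
	p.handle, err = libp2p.connectTo(friend) // pseudocode, not a real libp2p call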
	//...
}
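func (p *Pipe) Stat(){...}

package myFs
import "myNode"

func CreatePipe(ctx context.Context, name string, friend libp2p.ID) {
	//...
	// the key must be a PeerKey{} value to match the lookup in myNode
	callCtx := context.WithValue(ctx, myNode.PeerKey{}, friend)
	myFs.Create(callCtx, name, myNode.P2Ppipe)
	//...
}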

This may be a bad way of handling it though.
Needs more consideration.

@lidel
Member

lidel commented May 13, 2019

🚗 💨 apologies for a drive-by comment

[..] have we identified the concrete cases where a user of IPFS would specifically like to interact with IPFS this way, other than in backing FUSE?

Not sure how useful it is, but here is a potential data point: We've been thinking about adding WebDAV support (ipfs/in-web-browsers#146) at some point. At a high level it is similar to FUSE or any other FS mapper (verbs include COPY, MOVE, MKCOL, LOCK, UNLOCK). I suspect its future implementation could use the filesystem interface proposed here.

In theory, the JS team would have to create something similar, exposing multiple IPFS apis as a file system that fits naturally with their language, but looks more or less the same. More thought needs to be put into cross project though.

Yup, in JS land this type of shim/polyfill is quite popular (e.g. iso-stream-http acts as Node's http in the browser). I see how our JS team could create "js-ipfs-node-fs" to deliver API compatible with Node's fs.

Just like you hinted, this type of abstraction feels language-specific. I am not sure how well it plays with the idea of extending Core API and exposing golang-first APIs over the HTTP API.
Perhaps it should be moved to a standalone lib (similar to spf13/afero) instead?

@djdv djdv force-pushed the feat/filesystem branch 2 times, most recently from e335def to 649eb3c Compare June 13, 2019 17:59
@djdv djdv force-pushed the feat/filesystem branch from 649eb3c to 9f28d54 Compare June 29, 2019 12:53
@djdv djdv closed this Nov 23, 2019

4 participants