Accessing refs and CAS blobs inside archive files (tar, zip, …) #61

wking opened this issue Oct 16, 2017 · 0 comments

wking commented Oct 16, 2017

I was doing some initial work on this with casengine over the weekend (wking/casengine@252e064). What I have there assigns a single file URI namespace to a single filesystem directory or archive. That's obviously not going to scale if a single well-known URI references several archive files and the consumer wants to access all of them. So, how to approach this? I see two possibilities:

a. Add an archive-file protocol to the ref- and CAS-engine registries (it can be a single protocol spec behind both entries) which says “here's an archive file. Once you retrieve it, treat it as a file URI namespace and use this child protocol inside”. For example:

{
  "protocol": "archive-file-uri-v1",
  "uri": "file:///var/lib/oci-image/app.tar.gz",
  "mediaType": "application/x-tar",
  "encoding": "gzip",
  "child": {
    "protocol": "oci-cas-template-v1",
    "uri": "file:///blobs/{algorithm}/{encoding}"
  }
}

The mediaType and encoding properties would be optional, and unnecessary for HTTP(S) URIs. Maybe optional for file URIs too, but providing them avoids the need for peek-inside client type detection. It's a bit unfortunate that we lean on tar here, because “tar” is a bit of a moving target (opencontainers/image-spec#342), and none of the options has a registered media type. Still, application/x-tar is probably sufficient for these cases, because neither ref- nor CAS-engines need fancy file metadata, sparse files, or any of the other bits that are not supported in older tars.
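
A minimal sketch of how a consumer might handle an archive-file-uri-v1 entry like the one above, assuming the archive is already on local disk, the child template only needs simple string substitution, and the child file URI's path is read relative to the archive root. The function names (`expand_template`, `read_blob`) are hypothetical and are not part of casengine:

```python
import tarfile
from urllib.parse import urlparse


def expand_template(template, algorithm, encoded):
    # Assumption: plain substitution of the two variables used in the
    # example entry, not a full URI Template implementation.
    return (template
            .replace("{algorithm}", algorithm)
            .replace("{encoding}", encoded))


def read_blob(entry, digest):
    """Return the bytes of the blob named by digest (e.g. 'sha256:abc...')."""
    algorithm, _, encoded = digest.partition(":")
    child_uri = expand_template(entry["child"]["uri"], algorithm, encoded)
    # Treat the child file URI's path as a member name inside the archive.
    member = urlparse(child_uri).path.lstrip("/")
    archive_path = urlparse(entry["uri"]).path
    # With an explicit encoding we can skip detection; otherwise "r:*"
    # lets tarfile peek inside and pick the compression itself.
    mode = "r:gz" if entry.get("encoding") == "gzip" else "r:*"
    with tarfile.open(archive_path, mode) as archive:
        blob = archive.extractfile(member)
        if blob is None:
            raise ValueError(member + " is not a regular file")
        return blob.read()
```

Calling `read_blob(entry, "sha256:…")` with the entry above would open app.tar.gz and read the member under blobs/sha256/. A missing member surfaces as the `KeyError` that `tarfile` raises.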

b. Don't use file URIs. Instead, use fragments. A number of media types define these (text/plain, text/csv, application/pdf, application/xml, and possibly others), but application/zip does not (as far as I can tell), and tar has no registered media type at all. In this case, we'd have to define our own (unregistered?) tar type, and our own (unregistered?) fragment syntax for both tar and zip (we could probably use the same filename-based fragment syntax for both types). But we wouldn't have to define the archive-file-uri-v1 wrapping protocol. A CAS-engine entry would look like:

{
  "protocol": "oci-cas-template-v1",
  "mediaType": "application/x-tar",
  "uri": "https://example.com/image/app.tar.gz#blobs/{algorithm}/{encoding}"
}

And again, mediaType and encoding may be useful options for protocols using file URIs.
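
A rough sketch of the fragment approach in (b), assuming the filename-based fragment syntax suggested above (the fragment is simply the member path inside the archive) and an archive that has already been downloaded to a local path. The helper names (`resolve`, `read_member`) are hypothetical, and the fragment syntax itself is an assumption because no such syntax is registered for tar or zip:

```python
import tarfile
import zipfile
from urllib.parse import unquote, urldefrag


def resolve(entry, algorithm, encoded):
    """Expand the template, then split it into (archive URI, member path)."""
    uri = (entry["uri"]
           .replace("{algorithm}", algorithm)
           .replace("{encoding}", encoded))
    archive_uri, fragment = urldefrag(uri)
    return archive_uri, unquote(fragment)


def read_member(archive_path, media_type, member):
    """Read one member from a local tar or zip file, chosen by media type."""
    if media_type == "application/zip":
        with zipfile.ZipFile(archive_path) as archive:
            return archive.read(member)
    if media_type == "application/x-tar":
        # "r:*" lets tarfile detect gzip/bzip2/xz compression by itself.
        with tarfile.open(archive_path, "r:*") as archive:
            blob = archive.extractfile(member)
            if blob is None:
                raise ValueError(member + " is not a regular file")
            return blob.read()
    raise ValueError("unsupported media type: " + media_type)
```

Because the same member-path fragment works for both archive types here, the only type-specific piece is the reader selected from mediaType.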

I think I like (b) more, but I'd be fine with either approach. Do others have preferences? Other approaches to handling this?
