layer: Require ustar (originally from IEEE Std 1003.1-1988 but linking IEEE Std 1003.1-2013) #342
Trevor, this is a bit excessive. Honestly i'm inclined to just close this. |
On Sat, Sep 24, 2016 at 05:44:38AM -0700, Vincent Batts wrote:
Would it be less excessive if I moved the bar from pax (IEEE Std |
I do like the spirit of this because going down the path of defining this all in-spec (e.g. #317, #336) is an awful rabbit hole (inventing our own archive format), and I'd rather defer to external standards wherever possible. Unfortunately pax is not super well known or popular, so I fear we risk a lot of confusion by changing the references everywhere like this. Can we still generally talk about tar but then just define more specifically what we mean by it? (Most modern tar implementations support pax, right?) |
The idea with a spec like this is to define behavior so that different implementations can interoperate reliably. When there is an interop problem between implementations A and B, it should be clear from the spec whether implementation A is broken, implementation B is broken, or the spec is insufficiently clear.

For example, pax defines 'g' and 'x' typeflags [1] that aren't part of the older ustar (originally defined in IEEE Std 1003.1-1988) [2]. Before this commit, if implementation A produced a layer with a 'g' or 'x' typeflag and implementation B died unpacking it, it was unclear whose fault it was. With this commit, it is clearly A's fault (because it is using features not defined for ustar).

If implementation A had produced a layer with a '2' typeflag (which ustar specifies for symlinks) and implementation B died unpacking it, it is B's fault.

If implementation A had produced a layer with an 'S' typeflag (which GNU uses for sparse files [3]) and implementation B died unpacking it, it is neither party's fault, because ustar explicitly makes those values implementation-defined. Interop around them is up to out-of-band communication between the layer author and layer consumer, and is not covered by this spec.

The previous "File Types" section listed sockets, but the ustar spec has:

  Attempts to archive a socket using ustar interchange format shall produce a diagnostic message.

And I see no socket entry in Go's set of typeflag constants [4], so I'm not sure how they were supported before.

Go has supported pax since v1.1 [5,6], and pax lets you do things that ustar cannot (like having symlink targets longer than 100 characters). But we're avoiding requiring support for pax because of name-recognition issues [7].

[1]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02
[2]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#POSIX_ustar_Archives
[3]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#gnu-tar-archives
[4]: https://golang.org/pkg/archive/tar/#pkg-constants
[5]: https://codereview.appspot.com/6700047
[6]: golang/go@1068279
[7]: opencontainers#342 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
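The fault attribution argued in the commit message above can be sketched as a small lookup. This is only an illustration in Python, not anything proposed for the spec: `at_fault` and `USTAR_TYPEFLAGS` are hypothetical names, the typeflag constants come from Python's standard `tarfile` module, and counting the reserved '7' typeflag as ustar-defined is an assumption made here.

```python
import tarfile

# Typeflags that the ustar format itself assigns a meaning to: regular file
# ('0' and NUL), hard link, symlink, character device, block device,
# directory, FIFO, and '7' (assumption: treated as ustar-defined here).
USTAR_TYPEFLAGS = {
    tarfile.REGTYPE, tarfile.AREGTYPE, tarfile.LNKTYPE, tarfile.SYMTYPE,
    tarfile.CHRTYPE, tarfile.BLKTYPE, tarfile.DIRTYPE, tarfile.FIFOTYPE,
    tarfile.CONTTYPE,
}


def at_fault(typeflag: bytes) -> str:
    """Hypothetical helper: who is at fault if a consumer dies on this typeflag?"""
    if typeflag in USTAR_TYPEFLAGS:
        # Defined by ustar, so a conforming consumer must handle it.
        return "consumer"
    if len(typeflag) == 1 and b"A" <= typeflag <= b"Z":
        # 'A'-'Z' are implementation-defined (e.g. GNU's 'S' for sparse files).
        return "neither"
    # Everything else, including pax's 'g' and 'x', is outside ustar.
    return "producer"
```

Under a ustar-only requirement, `at_fault(b"x")` blames the producer, `at_fault(b"2")` blames the consumer, and `at_fault(b"S")` blames neither side.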
On Mon, Sep 26, 2016 at 02:09:44AM -0700, Jonathan Boulle wrote:
If we want to require pax support, I can put in links from anywhere
If we're requiring pax support, I'd rather call it pax. It seems like
I don't know how many tar implementations there are, but libarchive |
This is way out in left field. No one knows what any of this language means and it makes the specification impossible to read. What this is called is affectation:
|
On Mon, Sep 26, 2016 at 11:41:45AM -0700, Stephen Day wrote:
Can you point out a particular bit of the language I'm adding which I agree with @jonboulle that when external specs defining ustar (or |
I never heard about ustar in my entire life |
The entire thing is an example of affectation, even the comment that you responded with. The links you keep adding everywhere don't help to provide context or understanding. They are just a way to make your points look intelligent "'cause references". The fact is, no one knows what pax or ustar is (nor are these very well written specs). They know what tar is and there are several implementations. The stuff you are linking to are attempts at standards but most people defer to the implementations in GNU and BSD. Writing a specification like this doesn't add clarity, it just adds volume. Help your reader, don't drown them. |
On Mon, Sep 26, 2016 at 12:04:41PM -0700, Antonio Murdaca wrote:
Have you ever written a tar specification, validator, or unpacker? |
On Mon, Sep 26, 2016 at 12:05:41PM -0700, Stephen Day wrote:
I think “the current GNU (or BSD) implementation” or even “the v1.14 |
On Mon, Sep 26, 2016 at 12:35:11PM -0700, Stephen Day wrote:
Nope. Which is why I want to stay out of that business ;). But given |
On 26/09/16 12:55 -0700, W. Trevor King wrote:
I've heard of it. And used it. As well know that many features that |
On Mon, Sep 26, 2016 at 01:08:19PM -0700, Vincent Batts wrote:
This is useful information to put in the spec. Which features are you |
@wking Linking out to a large document can often add more confusion than it resolves. We need to be clear, in context, which portion of those specifications matter. For example, handling of sockets or pipes is probably not in scope and that needs to be clear in this specification. |
On Mon, Sep 26, 2016 at 02:43:13PM -0700, Stephen Day wrote:
Do we? I expect all consumers will be using an off-the-shelf tar
I think handling FIFOs (aka “named pipes”) is important, and ustar Our current spec also claims (ish) support for sockets 1, but I |
Rebased around #316 and #336 with 93b5819 → 37ec118. There are no
changes in the lines I'm adding, but with #316 in place I'm no longer
adding the gzip requirement.
I'm dropping the hardlink discussion from #336, because the ustar spec
covers hardlinks without us having to explain them locally. The
discussion in #336 is more detailed than the ustar equivalent, but
folks have been able to write interoperable tar implementations
sharing hardlinks since 1988 based on the ustar spec, so I doubt we
need to add the additional background and implementation hints here.
On the other hand, I'm not strongly against keeping some of this. If
folks do want to keep some or all of the hardlink tutorial, I'm fine
restoring it.
|
From experience, this is hardly the case. This spec needs to be clear about hardlinks in the context of layer files, which aren't covered in the 1988 specification. Remember, while these are indeed tar files, from the point of view of the specification, they represent the semantic concept of a filesystem changeset. One cannot simply unpack these on top of each other and get a useable result. The tar entries need to be interpreted and this specification instructs that interpretation. |
On Tue, Sep 27, 2016 at 01:32:32PM -0700, Stephen Day wrote:
I don't see anything that is layer-specific in the hardlink section
I think that's only because of whiteouts, which are orthogonal to |
@wking I'll read up on something myself if I don't know it. But my comment was referring to the fact that, in my opinion, you're trying to put stuff into this spec which isn't so common overall. The beauty of this spec (as well as the Docker spec) is its simplicity. By requiring all this other stuff you're trying to add everywhere, you make that simplicity go away. This is just my opinion of course. |
On Wed, Sep 28, 2016 at 08:07:08AM -0700, Antonio Murdaca wrote:
I think ustar is common. Are you aware of a tar tool or library
The Wikipedia entry currently linked from the spec has ~3k words. The |
you probably didn't get what I said if you are still quoting stuff which I understand - so, fair enough, I'm out of this conversation :) |
On Wed, Sep 28, 2016 at 08:22:26AM -0700, Antonio Murdaca wrote:
I was confused, but @runcom clarified on IRC that he considers the The file structure to store this information was later standardized And I think this PR simplifies things by cutting straight to that spec It also makes it clear that ustar compatibility is only a requirement |
As an example of the sort of thing you can do after this change, I've However, without this change there are no grounds in the spec for |
and maybe it's not the right time to do this? or it's not necessary altogether? you can see it yourself, that change isn't required anywhere, the spec is still working even w/o it. it's you adding it and then making that change. |
On Thu, Sep 29, 2016 at 03:33:34AM -0700, Antonio Murdaca wrote:
It's hard for me to imagine a modern tar implementation not even
The current layer-publisher ecosystem is (as I understand it, again, |
I'm still calling this ridiculous and excessive. At best linking to some IEEE pay-walled doc. But for many, that is a loose link that they will not research the validity of. I'm likely going to close this issue. |
On Thu, Oct 06, 2016 at 10:48:24AM -0700, Vincent Batts wrote:
It currently links to a non-paywalled 2013 version of the ustar spec. |
This discussion is exactly what the Unix world used to be like before Posix: we just wanted our programs to run without having to make conditionals for every different distribution's way of doing every single trivial thing. Posix didn't standardize the Why not just say GNU tar? Meaning that layers in conforming images will be extractable by GNU tar without unusual options. If you really wanted to tie it down, you could specify a tar version (1.29 is current). GNU is already central to Linux and tolerated on Windows, so it's a good-enough choice. |
On Mon, Oct 17, 2016 at 04:08:12PM -0700, Adrian Colley wrote:
I'm ok with requiring GNU tar, although I'd definitely want to pin to In addition to entries describing archive members, an archive may is pretty open-ended. The questions are “are the GNU docs |
I'm closing this, as the docs currently reference the GNU tar "standard" format and have requirements for xattrs. This PR goes far enough that it would be immediately incompatible with what folks are using and expecting. |
The idea with a spec like this is to define behavior so that different implementations can interoperate reliably. When there is an interop problem between implementations A and B, it should be clear from the spec whether implementation A is broken, implementation B is broken, or the spec is insufficiently clear.
For example, pax defines 'g' and 'x' typeflags that aren't part of the older ustar (IEEE Std 1003.1-1988). Before this commit, if implementation A produced a layer with a 'g' or 'x' typeflag and implementation B died unpacking it, it was unclear whose fault it was. With this commit, it is clearly B's fault (because it does not understand the pax typeflag).
If implementation A had produced a layer with an 'S' typeflag (which GNU uses for sparse files) and implementation B died unpacking it, it is neither party's fault, because pax explicitly makes those values implementation-defined. Interop around them is up to out-of-band communication between the layer author and layer consumer, and is not covered by this spec.
The previous “File Types” section listed sockets, but the pax spec has:
And I see no socket entry in Go's set of typeflag constants, so I'm not sure how they were supported before.
It looks like Go has supported pax since v1.1.
Spun off from discussion in #336. I don't really care what spec we require here. Requiring ustar (IEEE Std 1003.1-1988) support would be fine (a lower bar to implement, but does not support path names with more than 100 characters, etc.). Or requiring support for a number of specifications (e.g. you have to understand both pax and GNU records). But I think we should be requiring something, otherwise interop relies on unspecified, out-of-band agreements on layer format.
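One concrete case of the ustar length limits mentioned above is the link-target field, which is 100 bytes wide. A minimal sketch, assuming Python's standard `tarfile` module as the writer (`write_symlink` and `long_target` are hypothetical names used only for illustration): the same symlink is written once as ustar and once as pax.

```python
import io
import tarfile


def write_symlink(fmt: int, target: str) -> bytes:
    """Return an in-memory archive holding one symlink that points at `target`."""
    info = tarfile.TarInfo(name="link")
    info.type = tarfile.SYMTYPE
    info.linkname = target
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as archive:
        archive.addfile(info)
    return buf.getvalue()


long_target = "d/" * 60 + "file"  # 124 characters, over the 100-byte field

# Python's ustar writer refuses a link target that does not fit the field.
try:
    write_symlink(tarfile.USTAR_FORMAT, long_target)
    ustar_ok = True
except ValueError:
    ustar_ok = False

# The pax writer stores the long target in an extended header record, so the
# first 512-byte header block carries the 'x' typeflag (offset 156).
pax_bytes = write_symlink(tarfile.PAX_FORMAT, long_target)
first_typeflag = pax_bytes[156:157]
```

A consumer that only understands ustar would meet that leading 'x' record before the symlink itself, which is the kind of producer/consumer mismatch the choice of required format is meant to settle.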