Any chance of changing the whiteout file approach? #24

timthelion · 2016-04-14T21:16:22Z

It seems unfortunate that a new standard should use a method which unnecessarily limits the standard. With the .wh file approach, base images can no longer contain arbitrary data. This means, for example, that you cannot have an image with example OCI image-spec data stored in it. Is there any possibility of changing this to use, for example, a whiteout list instead? The files that made up a layer would then be:

VERSION
layer.tar
whiteouts
json

That would mean that truly arbitrary data could be stored in the images, which would be really nice :)
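
To make the limitation concrete, here is a minimal sketch of how an unpacker following the `.wh.` convention might classify a layer entry by its path alone. The function name `classify` is hypothetical and opaque whiteouts are deliberately ignored; the point is only that a file the author really wants to be called `.wh.foo` cannot be told apart from a request to delete `foo`.

```python
WHITEOUT_PREFIX = ".wh."


def classify(entry_path: str):
    """Return ("whiteout", target) or ("file", path) for a layer entry path.

    Hypothetical helper: it looks only at the base name, which is exactly
    why the path is "overloaded" under the .wh. convention.
    """
    directory, _, base = entry_path.rpartition("/")
    if base.startswith(WHITEOUT_PREFIX):
        target = base[len(WHITEOUT_PREFIX):]
        return ("whiteout", f"{directory}/{target}" if directory else target)
    return ("file", entry_path)


if __name__ == "__main__":
    print(classify("etc/passwd"))
    # An entry meant as a literal file named ".wh.foo" is read as "delete foo".
    print(classify(".wh.foo"))
```

With a separate whiteout list, the second call would not need to guess: the path would always mean "unpack here", and deletions would be recorded elsewhere.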

vbatts · 2016-04-14T21:23:10Z

To be clear, you are saying that the .wh approach is limited? And the
whiteout file is preferred?

If so, I agree.

timthelion · 2016-04-14T21:26:15Z

Creating a list of files to be whited out, and keeping that outside of layer.tar is better than putting .wh files in layer.tar. We wouldn't want a situation like this: http://git-annex.branchable.com/forum/Storing_git_repos_in_git-annex/

vbatts · 2016-04-14T21:35:29Z

I wholly agree and intend to see it done as a whiteout file list.

philips · 2016-04-14T22:26:02Z

I think this is something we should think about for post v1.0.0. I wholeheartedly agree, but in the ideal case the initial v1 serialization spec is binary compatible with the existing Docker serialization, to save the sanity of all of the existing registry folks like acr, gcr.io, Quay, Hub, etc.

timthelion · 2016-04-14T22:48:04Z

Would it be possible to use the whiteout list file IFF a whiteout list file exists, and otherwise use the .wh approach?

philips · 2016-04-15T00:22:12Z

@timthelion Yes, that would be the right way of approaching it. It would be a schema bump, which would be a version break. Happy to consider adding this feature to fix the issue, but I do want to hold off until we are past v1.0.0 (in a couple of months).

timthelion · 2016-04-15T08:32:31Z

@philips so basically, you want to have the 1.0 release be supported by Docker/CoreOS/whatever, without actually having to change Docker/CoreOS/whatever's code? Basically, there will be no actual technical changes to the spec before 1.0?

cyphar · 2016-07-20T20:54:47Z

@timthelion No breaking technical changes.

wking · 2016-09-29T03:57:26Z

On Thu, Apr 14, 2016 at 02:26:16PM -0700, Timothy Hobbs wrote:

Creating a list of files to be whited out, and keeping that outside
of layer.tar is better than putting .wh files in layer.tar.

If we don't mind picking up a non-standard tar entry, star
(Schilling's tar) uses pax extension headers and defines
SCHILY.filetype with (among other things) a "whiteout" value
representing a BSD whiteout directory entry [1]. And as far as I can
tell, that's the same sort of whiteout we're interested in. So if we
want a way to represent whiteouts without leaving tar or restricting
the legal filename space, that's probably a good choice.
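
As a rough illustration of the pax-header idea, the sketch below uses Python's standard `tarfile` module to attach a `SCHILY.filetype=whiteout` record to one entry and then find it again on read. The helper names `make_layer` and `whiteouts` are made up for this example, and the sketch does not claim to reproduce everything star writes for a whiteout; it only shows that the marker can live outside the entry path.

```python
import io
import tarfile


def make_layer() -> bytes:
    """Build a tiny pax-format layer with one whiteout and one literal file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # Whiteout marker carried in a pax extended header, not in the name.
        gone = tarfile.TarInfo(name="etc/passwd")
        gone.pax_headers = {"SCHILY.filetype": "whiteout"}
        tar.addfile(gone)
        # A file whose name starts with ".wh." is now just an ordinary file.
        tar.addfile(tarfile.TarInfo(name="etc/.wh.example"))
    return buf.getvalue()


def whiteouts(layer: bytes) -> list[str]:
    """Return the paths that carry the whiteout pax header."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
        return [m.name for m in tar
                if m.pax_headers.get("SCHILY.filetype") == "whiteout"]


if __name__ == "__main__":
    print(whiteouts(make_layer()))
```

Because the decision is made from the header rather than the name, `etc/.wh.example` is unpacked as a regular file in this scheme.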

aecolley · 2016-10-16T01:17:40Z

Now that v1.0.0-rc1 is out, it's the last chance to consider this before v1.0.0 is final. After that, this will be such an incompatible change that it will have to wait for v2.0.0 (assuming SemVer).

wking · 2016-10-16T02:28:07Z

You could add a new whiteout approach in 1.1. You'd only need to go to 2.0 if you remove or make backward-incompatible changes to an existing approach.

aecolley · 2016-10-17T22:30:50Z

A backwards-compatible change means either that a 1.1 image must be processed correctly by a 1.0 extractor, or that a 1.0 image must be processed correctly by a 1.1 extractor; depending on your point of view. In the case of 1.1-image-on-1.0-extractor, the extractor will not know about any way to produce a file named .wh.foo, regardless of how 1.1 decides to represent it. In the case of 1.0-image-on-1.1-extractor, the image cannot contain a layer with any .wh. file, because the 1.0 spec states unambiguously that there is no representation which can produce such a file.

Either way, it seems to me that it's impossible to construct an image which extracts a .wh.foo file into the unpacked bundle, unless either (a) both image and extractor are version 1.1, which is not backwards-compatible by definition; or (b) the image produces different bundle contents in 1.0 extractors and 1.1 extractors; which is an incompatibility all on its own.

Perhaps there's something I'm missing. Perhaps these limitations are acceptable to the project members. Otherwise, it's something that should be addressed before v1.0.0, IMHO.

vbatts · 2016-10-17T22:41:43Z

Interesting. There is an option of approaching whiteouts like overlayfs
does, by setting the device to 0 on non-directories and an xattr for directories.

aecolley · 2016-10-17T23:00:50Z

Unfortunately, support for extended attributes varies widely and incompatibly among tar implementations. And that gets us into the tarpit of the #342 discussion (pun accidental, I swear). But it works if the chosen tar format supports it.

wking · 2016-10-17T23:04:05Z

On Mon, Oct 17, 2016 at 03:30:51PM -0700, Adrian Colley wrote:

A backwards-compatible change means either that a 1.1 image must be
processed correctly by a 1.0 extractor, or that a 1.0 image must be
processed correctly by a 1.1 extractor; depending on your point of
view.

This should be made very clear in the spec; leaving it open to
interpretation is going to make SemVer largely useless. In
runtime-spec, opencontainers/runtime-spec#523 made it clear that you
may need a new runtime after bumping your config's minor version.

In the case of 1.1-image-on-1.0-extractor, the extractor will not
know about any way to produce a file named .wh.foo, regardless of
how 1.1 decides to represent it.

Agreed, which is why I want a 1.0 extractor to error out if you give
it a 1.1 image.

In the case of 1.0-image-on-1.1-extractor, the image cannot contain
a layer with any .wh. file, because the 1.0 spec states
unambiguously that there is no representation which can produce such
a file.

Well, you can have ‘.wh.foo’ entries. They're just interpreted as
“remove foo” and not “please create a file (or directory, or …) at
.wh.foo”. If 1.1 declares a new media type
(e.g. application/vnd.oci.image.layer.v1.1.tar) supporting
SCHILY.filetype whiteout [1] or device 0 [2] or some other way to
avoid the current path overloading, then 1.0 extractors will correctly
die (with “I've never heard of
application/vnd.oci.image.layer.v1.1.tar”) and 1.1 and later 1.x
extractors will correctly unpack the layer (including any .wh.foo
files it contains).

Either way, it seems to me that it's impossible to construct an
image which extracts a .wh.foo file into the unpacked bundle,
unless either (a) both image and extractor are version 1.1, which is
not backwards-compatible by definition; or (b) the image produces
different bundle contents in 1.0 extractors and 1.1 extractors;
which is an incompatibility all on its own.

Backwards-compat means “the old stuff still works with the new tools”.
So 1.0 images would still work with 1.1 tools. But if .wh.foo files
are impossible in 1.0 (which seems like the current path), then yeah,
no 1.0 images are going to have them. But if you want a .wh.foo file,
and are willing to create a 1.1 image and use a 1.1+ extractor, that
will work. You don't need to bump to 2.0 (and throw out all of your
other 1.0 images).
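
The compatibility argument above can be sketched as a lookup keyed on the layer media type. Everything here is illustrative: `application/vnd.oci.image.layer.v1.1.tar` is the hypothetical media type floated in this thread, and the scheme labels, table names, and `whiteout_scheme` function are assumptions made for the example.

```python
V1_TAR = "application/vnd.oci.image.layer.v1.tar"
V1_1_TAR = "application/vnd.oci.image.layer.v1.1.tar"  # hypothetical, from this thread


class UnknownMediaType(Exception):
    """Raised when an extractor is handed a layer type it does not know."""


# What each (hypothetical) extractor version knows how to unpack.
EXTRACTOR_1_0 = {V1_TAR: "path-prefix"}
EXTRACTOR_1_1 = {V1_TAR: "path-prefix", V1_1_TAR: "pax-header"}


def whiteout_scheme(supported: dict, media_type: str) -> str:
    """Pick the whiteout scheme for a layer, or fail loudly if unknown."""
    try:
        return supported[media_type]
    except KeyError:
        raise UnknownMediaType(f"I've never heard of {media_type}") from None


if __name__ == "__main__":
    print(whiteout_scheme(EXTRACTOR_1_1, V1_TAR))    # old layers still work
    print(whiteout_scheme(EXTRACTOR_1_1, V1_1_TAR))  # new layers work on new tools
```

A 1.0 extractor given the newer media type raises `UnknownMediaType` instead of silently producing a different filesystem, which is the behaviour argued for above.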

aecolley · 2016-10-17T23:14:23Z

@wking OK, I see your point. 1.0 images can work on 1.x extractors so long as they don't attempt to create non-whiteout .wh. files, which is fine. I withdraw my position.

stevvooe · 2016-10-19T23:47:26Z

After spending some more intimate time with whiteout files (.wh.), the approach currently employed in this specification is really as good as any.

What good does changing this provide? I don't see how this allows layers to have "arbitrary data".

wking · 2016-10-19T23:55:24Z

On Wed, Oct 19, 2016 at 04:47:27PM -0700, Stephen Day wrote:

After spending some more intimate time with whiteout files (.wh.),
the approach currently employed in this specification is really as
good as any.

What good does changing this provide? I don't see how this allows
layers to have "arbitrary data".

As @timthelion describes in the topic post [1], the current approach
makes it impossible to distribute .wh.* files because the entry path
is overloading “unpack me to here” and “delete the stuff there”. The
alternatives discussed here:

a. An external ‘whiteouts’ [1].
b. SCHILY.filetype set to whiteout [2].
c. Device set to 0 [3].

all either give us a non-overloaded place to store the “delete the
stuff there” bit (a and b) or pick a location where the overloading is
less restrictive (c).

stevvooe · 2016-10-20T00:53:24Z

@wking Is there a realistic use case for distributing .wh. files, other than packing up a container runtime into an image?

Please, for the love of god, stop doing this.

wking · 2016-10-20T03:31:48Z

On Wed, Oct 19, 2016 at 05:53:25PM -0700, Stephen Day wrote:

@wking Is there a realistic use case for distributing .wh. files,
other than packing up a container runtime into an image?

I don't have a personal use case for it, but I would like to avoid
complication when explaining what folks can put into layers. And from
an implementation perspective both SCHILY.filetype set to whiteout and
device set to 0 would be very easy to implement in code that already
uses the .wh.* approach.

stevvooe · 2016-10-21T18:40:16Z

@wking I don't really get this: the proposed layout is not how images are laid out. Such a provision requires either having in-band whiteout or a container format, which complicates unpacking. A tar of a tar, while we do it in image layout, should be avoided.

If we use overlay style device, how do you implement devices with Windows on NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked? How do you encode these in other archive formats that don't have device support, like zip?

wking · 2016-10-21T22:14:02Z

On Fri, Oct 21, 2016 at 11:40:17AM -0700, Stephen Day wrote:

A tar of a tar, while we do it in image layout, should be avoided.

I agree, which is why I prefer the SCHILY.filetype whiteout approach or the device 0 approach. Both of those are in-band (like the current .wh.* approach), but SCHILY.filetype is not overloaded at all and device 0 is a less-restrictive overload.

If we use overlay style device…

I think you're talking about @vbatts's device 0 approach here.

… how do you implement devices with Windows on NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked?

If we land “device set to zero means whiteout” docs (for application/vnd.oci.image.layer.v1.tar or application/vnd.oci.image.layer.v1.1.tar), then an unpacker handling such a tarball will invoke the whiteout operation whenever it hits a device 0 entry. I don't see how the OS comes into it, since whiteouts are a cross-platform idea. Windows unpackers can still fail if they encounter a device 1 entry, etc.

The SCHILY.filetype-set-to-whiteout approach would also be cross-platform.

How do you encode these in other archive formats that don't have device support, like zip?

You don't, but that's not a big deal. We don't have an application/vnd.oci.image.layer.v1.zip format now, and I don't hear anyone calling for one. If there is a future need for zip-based layers, the authors of the zip-layer spec will need to figure out a scheme for marking whiteouts. Maybe they'll use .wh.*, and maybe they'll use something else, but I don't think the potential presence of a future zip-based format is a good reason to overload the path as a whiteout marker in tar.

wking · 2016-10-25T18:20:32Z

On Tue, Oct 25, 2016 at 11:12:15AM -0700, Timothy Hobbs wrote:

I don't know what the best way to change the spec is, but I personally think that it would include adding a directory to the tarball with a whiteout list and any other information that we might want to add in the future, so as to make the spec extendable.

You can extend at any time by minting new media types. You don't have to add support for something like this now on the off-chance that we'll use it later.

That is that the / directory of the layer should include a /opencontainer-data directory.

This has the same path-restriction as the current .wh.* approach, although it limits the restriction to a single path. POSIX pax extended headers provide the same functionality without overloading the entry path. And SCHILY.filetype is one example of what you can do with those extended headers.

timthelion · 2016-10-25T18:45:06Z

Excellent. I of course don't want to restrict the possible paths at all,
and did not know that a better method was possible. Somehow, I did not
understand from your discussion that this SCHILY thing is a method of
hiding files inside the TAR that remain outside of the file tree.

On 10/25/2016 08:20 PM, W. Trevor King wrote:

On Tue, Oct 25, 2016 at 11:12:15AM -0700, Timothy Hobbs wrote:

I don't know what the best way to change the spec is, but I
personally think that it would include adding a directory to the
tarball with a whiteout list and any other information that we
might want to add in the future, so as to make the spec extendable.

You can extend at any time by minting new media types. You don't have
to add support for something like this now on the off-chance that
we'll use it later.

That is that the / directory of the layer should include a
/opencontainer-data directory.

This has the same path-restriction as the current .wh.* approach,
although it limits the restriction to a single path. POSIX pax
extended headers
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02,
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_03)
provide the same functionality without overloading the entry path. And
SCHILY.filetype
(http://cdrtools.sourceforge.net/private/man/star/star.4.html) is one
example of what you can do with those extended headers.

stevvooe · 2016-10-25T19:03:02Z

@timthelion Besides the contrived examples, do you have an extant proof that someone has actually run into a naming collision with whiteout files? I think there are problems in nested container scenarios but that can be solved with filesystem passthrough.

Look, I am not saying that AUFS whiteouts are ideal, but I thought the goal here was to define a container standard based on working systems. Especially, one that people actually use.

If we always focus on the limitations, we'll never realize the benefits.

aecolley · 2016-10-26T00:37:37Z

"It's already implemented" is literally the only good thing to say about the .wh. scheme, but it's still a powerful argument. Bumping the imageLayoutVersion can be used to signal a better-thought-out scheme in the future. Anyone who really needs .wh. files in a 1.0.0 image can rename them into place as the first action in the container's runtime 😧.

I'm really only an outsider throwing peanuts from the gallery, but if I were dictator, I'd make the .wh. prefix overridable by a new field in the image config, because that's a quick-to-implement change. But being stuck with .wh. for 1.0 is OK as long as we're not going to be stuck with it for the long term.

wking · 2016-10-26T15:41:57Z

On Tue, Oct 25, 2016 at 05:37:37PM -0700, Adrian Colley wrote:

Bumping the imageLayoutVersion can be used to signal a
better-thought out scheme in the future…

imageLayoutVersion has nothing to do with it. The thing we'd be
bumping is the layer media type (e.g. by minting a new
application/vnd.oci.image.layer.v1.1.tar [1]). But your main point
(that we aren't stuck with this long term unless we plan to support
the v1.0 media types forever) stands.

vbatts · 2016-10-28T13:05:56Z

a couple points:

I agree that pinning this standard to the rejected AUFS approach is silly. The only redeeming thing is that "that's the way it's done in docker". The .wh. is a silly approach. The lack of ordering still bugs me.
Having a whiteout list provided would be a decent alternative. Best to have it in the ordered list of "layers" or "references" to be applied. Because if it were just a file at the / root inside the tarball, then it would best be either the first entry or directly after an entry for /.
There is an approach taken by overlayfs, which is upstreamed. Detailed at https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt. For directories they set an xattr of trusted.overlay.opaque=y, which is somewhat tailored. But for all other file types they use a trick of setting the entry as a character device with 0/0 major/minor device, which hides it. Somewhat clever. Similar enough to the AUFS whiteout approach, with none of the silly .wh. name prefixing.
That this SCHILY. approach is possible does not mean that it should be pursued. I personally am against the SCHILY. approach. It is a PAX header, but would require changes to virtually all tar implementations. Even the golang library would require some rework to allow for setting arbitrary pax headers like a new SCHILY. type. Which would likely mean carrying/maintaining a fork of upstream stdlib. This is a non-starter.
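
For reference, the overlayfs-style markers described in the third point can be expressed in a tar stream roughly as follows. This is a sketch using Python's `tarfile`, not a proposal for spec wording: `build_layer` and `scan` are hypothetical helpers, and the pax key `SCHILY.xattr.trusted.overlay.opaque` is assumed to be how the `trusted.overlay.opaque` xattr would be carried in the archive.

```python
import io
import tarfile

# Assumed pax key for the overlayfs opaque-directory xattr.
OPAQUE_KEY = "SCHILY.xattr.trusted.overlay.opaque"


def build_layer() -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # Whiteout for a non-directory: character device with major/minor 0/0.
        gone = tarfile.TarInfo(name="etc/passwd")
        gone.type = tarfile.CHRTYPE
        gone.devmajor = 0
        gone.devminor = 0
        tar.addfile(gone)

        # An ordinary character device (1/3 is /dev/null on Linux): not a whiteout.
        null = tarfile.TarInfo(name="dev/null")
        null.type = tarfile.CHRTYPE
        null.devmajor = 1
        null.devminor = 3
        tar.addfile(null)

        # Opaque directory: the xattr travels in a pax extended header.
        opaque = tarfile.TarInfo(name="var/cache")
        opaque.type = tarfile.DIRTYPE
        opaque.pax_headers = {OPAQUE_KEY: "y"}
        tar.addfile(opaque)
    return buf.getvalue()


def scan(layer: bytes):
    """Return (whiteout paths, opaque directory paths) found in the layer."""
    whiteouts, opaque_dirs = [], []
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
        for m in tar:
            if m.ischr() and m.devmajor == 0 and m.devminor == 0:
                whiteouts.append(m.name)
            elif m.isdir() and m.pax_headers.get(OPAQUE_KEY) == "y":
                opaque_dirs.append(m.name)
    return whiteouts, opaque_dirs


if __name__ == "__main__":
    print(scan(build_layer()))
```

Note that the unpacker never has to create a device node on disk; it only inspects the header fields, so the check itself does not depend on the host filesystem.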

wking · 2016-10-28T15:20:05Z

On Fri, Oct 28, 2016 at 06:05:57AM -0700, Vincent Batts wrote:

There is an approach taken by overlayfs, which is
upstreamed. Detailed
https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt. For
directories they set an xattr, of trusted.overlay.opaque=y which
is somewhat tailored. But for all other file types they use a
trick of setting the entry as a character device to 0/0
major/minor device, which hides it. Somewhat clever. Similar
enough to the AUFS whiteout approach, with none of the silly
.wh. name prefix-ing.

While this talk of a SCHILY. approach is possible does not mean
that it should be pursued. I personally am against the
SCHILY. approach. It is a PAX header, but would require changes
to virtually all tar implementations. Even the golang library
would require some re-work to allow for setting arbitrary pax
headers like a new SCHILY. type. Which would likely mean
carrying/maintaining a fork of upstream stdlib. This is a
non-starter.

I don't think we'd need to fork archive/tar to support SCHILY.filetype
whiteout. GNU's tar is already using pax Extended Headers for xattr
support:

$ zgrep TMPFS_XATTR /proc/config.gz
CONFIG_TMPFS_XATTR=y
$ mkdir a
$ sudo mount -t tmpfs none a
$ touch a/b
$ md5sum a/b
d41d8cd98f00b204e9800998ecf8427e a/b
$ sudo setfattr -n trusted.md5sum -v d41d8cd98f00b204e9800998ecf8427e a/b
$ tar --version
tar (GNU tar) 1.28
…
$ sudo tar -cf a.tar --xattrs a
$ strings a.tar | grep SCHILY
64 SCHILY.xattr.trusted.md5sum=d41d8cd98f00b204e9800998ecf8427e

With SCHILY.filetype we'd just be putting a different payload in that
typeflag x entry. I can mock up an example if that would help
demonstrate the point (or turn up any implementation issues I'm
missing ;).

wking · 2016-10-28T22:41:06Z

On Fri, Oct 28, 2016 at 08:20:06AM -0700, W. Trevor King wrote:

$ strings a.tar | grep SCHILY
64 SCHILY.xattr.trusted.md5sum=d41d8cd98f00b204e9800998ecf8427e

With SCHILY.filetype we'd just be putting a different payload in that
typeflag x entry. I can mock up an example if that would help
demonstrate the point (or turn up any implementation issues I'm
missing ;).

Looking into this more today, it turns out that there is an issue with
Go's stock archive/tar: Go currently consumes x typeflag entries
internally and throws out any extended headers it doesn't recognize
(golang/go#14472). I've suggested Go provide a way around that
limitation, but until that happens Go will ignore any extended headers
besides SCHILY.xattr.*. Overlayfs's trusted.overlay.opaque (which in
the pax extended header would be SCHILY.xattr.trusted.overlay.opaque)
avoids the current Go limitation.

I don't understand why overlayfs uses one approach (c0/0) for whiteout
files and another (trusted.overlay.opaque) for opaque directories. It
seems like they'd use trusted.overlay.whiteout or some such to follow
the xattr pattern. Maybe there is a performance benefit to avoiding
an xattr lookup for whiteouts, but opaque directories need to be
directories so they can hold upper-level content, and you can't have a
c0/0 directory ;). Unpacking a layer tarball is not something that
happens as frequently as overlayfs file access, so if the reason for
using c0/0 in overlayfs is performance, that reasoning may not apply
here. On the other hand, c0/0 works for overlayfs, and major 0 is
reserved (at least on Linux [1]). POSIX doesn't have much to say
about major/minor [2], with the most detail coming from the ustar
spec. From ustar's typeflag docs [3]:

3,4
Represent character special files and block special files
respectively. In this case the devmajor and devminor fields shall
contain information defining the device, the format of which is
unspecified by this volume of POSIX.1-2008. Implementations may
map the device specifications to their own local specification or
may ignore the entry.

So I'm ok with the following ways for specifying “this is a whiteout”:

a. Follow overlayfs into overloading c0/0, which likely works (and
matches an existing implementation) but has unclear namespacing and
portability.

b. Follow overlayfs into coining a new SCHILY.xattr.trusted
(SCHILY.xattr.trusted.oci.whiteout?). This has better namespacing
than (a) but means we're coining a new pax extended header string
(even though we're following an established “use a pax extended
header” approach).

c. Wait until Go fixes their pax extended header processing and use
the existing SCHILY.filetype whiteout.

cyphar · 2016-11-05T11:42:45Z

I was reading about the Tar archive format the other day, and I noticed that we could just use Typeflag for this. Just specify that 'W' indicates a whiteout path (A-Z is defined as fair-game for tar archives from what I understand). This also helps with ensuring that we don't have to make it clear that you have to add a directory as a "file" when whiting it out.
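
A minimal sketch of the typeflag idea, assuming `W` were chosen as the whiteout marker (it is not defined by any spec; it is just the letter suggested above). Python's `tarfile` is used here only because it passes unknown typeflags through unchanged; `build_layer` and `whiteouts` are hypothetical helper names.

```python
import io
import tarfile

WHITEOUT_TYPE = b"W"  # hypothetical typeflag, as suggested in this comment


def build_layer() -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # The entry path names the thing to delete; the typeflag says "whiteout".
        gone = tarfile.TarInfo(name="etc/passwd")
        gone.type = WHITEOUT_TYPE
        tar.addfile(gone)
        # A regular file whose name happens to start with ".wh.".
        tar.addfile(tarfile.TarInfo(name="etc/.wh.example"))
    return buf.getvalue()


def whiteouts(layer: bytes) -> list[str]:
    """Return the paths whose header typeflag is the whiteout marker."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
        return [m.name for m in tar if m.type == WHITEOUT_TYPE]


if __name__ == "__main__":
    print(whiteouts(build_layer()))
```

The same entry works whether the removed path was a file or a directory, since the typeflag replaces the original type rather than wrapping it.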

One thing I ran into with umoci is that I'm not entirely sure what I'm meant to do with parent directories and whiteouts. Is it required that I include all of the parent components of a path in a diff layer if I'm whiting out a file? Or can I also just include the entries which are specifically paths that have been removed / explicitly changed?

wking · 2016-11-05T14:52:52Z

On Sat, Nov 05, 2016 at 04:42:46AM -0700, Aleksa Sarai wrote:

I was reading about the Tar archive format the other day, and I
noticed that we could just use Typeflag for this. Just specify
that 'W' indicates a whiteout path…

This is the same approach as SCHILY.filetype whiteout, but without
using a pax extended header. I'd rather wait for golang/go#14472 and
use the pax extended header, to avoid accidentally misinterpreting
some other tar extension that has decided to use a ‘W’ type for
something else. Especially with #342 still in flight, leaving which
tar unpackers MUST recognize very unclear.

One thing I ran into with umoci is that I'm not entirely sure what
I'm meant to do with parent directories and whiteouts. Is it
required that I include all of the parent components of a path in a
diff layer if I'm whiting out a file? Or can I also just include the
entries which are specifically paths that have been removed /
explicitly changed?

Layers only need to contain the paths they're touching. If you want
to whiteout ./a/b/c but don't need to touch ./a or ./a/b, your tar can
have a single entry touching:

./a/b/c

This is covered by the “extracted like a regular tar archive” wording
[1] and underlined (for the ./ case) in the text that landed via #408.
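
As a simplified model of that rule, the sketch below applies a changeset to a set of existing paths. It assumes the current `.wh.` convention, treats the filesystem as a plain set of path strings, and ignores entry ordering and opaque whiteouts; `apply_layer` is a hypothetical name.

```python
def apply_layer(rootfs: set[str], entries: list[str]) -> set[str]:
    """Apply layer entries to a set of paths (simplified model).

    A ".wh.<name>" entry removes <name> and everything beneath it;
    any other entry is added. Parent directories need not be listed.
    """
    result = set(rootfs)
    for entry in entries:
        directory, _, base = entry.rpartition("/")
        if base.startswith(".wh."):
            name = base[len(".wh."):]
            target = f"{directory}/{name}" if directory else name
            result = {p for p in result
                      if p != target and not p.startswith(target + "/")}
        else:
            result.add(entry)
    return result


if __name__ == "__main__":
    lower = {"a", "a/b", "a/b/c", "a/b/d"}
    # One entry is enough to white out a/b/c; a and a/b are untouched.
    print(sorted(apply_layer(lower, ["a/b/.wh.c"])))
```

The layer in the usage example contains a single entry, and the parent directories `a` and `a/b` survive because nothing in the layer touches them.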

cyphar · 2016-11-11T12:35:29Z

I just ran into this issue with this commit: cyphar/umoci@2a421289266d. Basically if you call lstat on AUFS with a whiteout path you get an EPERM. Which means that minor ordering issues like that commit fixes become permission issues and bugs.

lstat(2) returns EPERM if you try to get information about a whiteout path. This is one of the reasons I think that matching the AUFS semantics is insane. Ref: opencontainers/image-spec#24 Signed-off-by: Aleksa Sarai <asarai@suse.com>

wking · 2017-08-25T23:43:01Z

On Sat, Nov 05, 2016 at 07:52:54AM -0700, W. Trevor King wrote:

I'd rather wait for golang/go#14472 and use the pax extended header, to avoid accidentally misinterpreting some other tar extension that has decided to use a ‘W’ type for something else.

golang/go#14472 was fixed today, so once Go 1.10 is cut (in February) the SCHILY.filetype approach will be supportable using the Go stdlib. Collecting a new list of potential approaches (because there have been some new ones since [1]):

a. An external ‘whiteouts’ [2].
b. SCHILY.filetype set to whiteout (like BSD) [3].
c. Device set to 0 (like overlayfs for non-directories) [4,5].
d. SCHILY.xattr.trusted.overlay.opaque set to y (like overlayfs for directories) [5].
e. Setting SCHILY.xattr.trusted.oci.whiteout (or OPENCONTAINER.whiteout, or whatever) to y in the same spirit as overlayfs but with an OCI-specific key [6].
f. Typeflag set to W [7].

I'd prefer not to do a, since it complicates unpacking if we need more information than the tar content itself. I'm fine with any of the other options, although b, c, and e seem the least likely to result in collisions with third-party pax extensions applying semantics other than “this is a whiteout entry”.

[1]: #24 (comment)
[2]: #24 (comment)
[3]: #24 (comment)
[4]: #24 (comment)
[5]: #24 (comment)
[6]: #24 (comment)
[7]: #24 (comment)

cyphar · 2017-08-26T23:31:10Z

@wking I think that (b) or (e) are the two best options. (e) is the best from a "let's create a standard that you have to explicitly implement" perspective, but (b) is the best from a "let's do what other people do already" (I consider the BSDs to be a better source of inspiration in this area than Linux filesystem internals). As for (c) and (d), they feel like they'd either be inconsistent with overlayfs anyway (since overlayfs does both) and would likely be inconsistent with how a normal user of a tar parsing library would act. (a) feels clunky, and (f) feels like a misuse of Typeflag which is a very small namespace.

wking · 2017-08-27T05:23:02Z

On Sat, Aug 26, 2017 at 11:31:11PM +0000, Aleksa Sarai wrote:

@wking I think that (b) or (e) are the two best options. (e) is the best from a "let's create a standard that you have to explicitly implement" perspective, but (b) is the best from a "let's do what other people do already" (I consider the BSDs to be a better source of inspiration in this area than Linux filesystem internals).

Sounds good to me. Does someone want to pick one of those so we can file a PR for a new layer media type that uses the chosen whiteout approach?

stevvooe · 2017-08-31T01:22:18Z

What problem are we trying to solve here? There are already millions of container images using this approach. Tweaking this won't actually make the format better; it just means that compliant implementations now need to handle two variations, instead of one.

If we are going to make an incompatible format, it would be better to move away from a layered diff approach to something tree oriented (like https://github.com/containerd/continuity).

timthelion · 2017-08-31T17:49:37Z

This conversation is going around in circles. @stevvooe, you keep on coming back and saying basically the same things. I am, however, now against any changes, we should never break a standard after 1.0. And 1.0 was a long time ago. Docker's 1.0 that is. When I'm discussing this with you @stevvooe, I feel like Docker Inc. doesn't really want OCI to be an independent project. I feel like you want a facade of independence, a facade of industry consensus, behind which you can be the puppet masters. It feels to me like this is more a game of obstructionism than a real technical discussion, and I don't think there's any point in me or anyone else fighting that. Your reasons, that there were millions of containers using the standard, were true before the standard even existed, before the name OCI was ever dreamed up, and the only reason that's true is because this IS the Docker standard. Nothing more. You're making me feel that, in reality, this is Docker's project, and the rest of us should either accept that fact, or pack up and go home.

wking · 2017-08-31T20:16:12Z

On Thu, Aug 31, 2017 at 10:49:39AM -0700, Timothy Hobbs wrote: I am, however, now against any changes, we should never break a standard after 1.0. And 1.0 was a long time ago. Docker's 1.0 that is.

I don't think anyone is arguing for removing application/vnd.oci.image.layer.v1.tar and friends [1], which will continue to use the current .wh.* approach [2]. This is about specifying a new format that doesn't suffer from the .wh.* limitations [3,4].

And to @stevvooe's most recent point [5], I would be very happy with a Merkle tree that had (sub)file granularity. #577 is probably a more targeted issue for that. And however that happens, it's going to be a pretty large shift. Using any of [6] (SCHILY.filetype set to ‘whiteout’ or SCHILY.xattr.trusted.oci.whiteout set to ‘y’ in an application/vnd.oci.image.layer.v2.tar family of media types) would be a much smaller shift that is much easier to support. If we expect #577 to be addressed in the near future, then there's probably no point in a v2.tar family. But if we expect #577 to linger for years, then I think it's worth creating a v2.tar family to address the v1.tar limitations.

[1]: https://github.com/opencontainers/image-spec/blob/v1.0.0/layer.md#image-layer-filesystem-changeset
[2]: https://github.com/opencontainers/image-spec/blob/v1.0.0/layer.md#whiteouts
[3]: #24 (comment)
[4]: #24 (comment)
[5]: #24 (comment)
[6]: #24 (comment)
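For readers who want to see what the pax-header variant mentioned in this comment might look like in practice, here is a minimal sketch using Python's standard tarfile module. It is only an illustration under stated assumptions: the SCHILY.xattr.trusted.oci.whiteout key comes from the proposal in this thread and is not part of any released spec, the v2.tar media type family does not exist, and the helper names build_layer and read_layer are hypothetical. The sketch marks a deleted path with a per-entry pax header rather than a .wh. filename prefix, so a file whose name really does start with .wh. can still be stored as ordinary data.

```python
import io
import tarfile

# Key proposed in this thread; an assumption, not part of any released spec.
WHITEOUT_KEY = "SCHILY.xattr.trusted.oci.whiteout"


def build_layer(files, whiteouts):
    """Return tar bytes holding regular files plus header-marked whiteouts.

    Hypothetical helper: `files` maps path -> bytes, `whiteouts` lists
    paths that this layer deletes from the layers below it.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        for name in sorted(whiteouts):
            # Zero-length placeholder at the deleted path; the deletion is
            # signalled by the pax extended header, not by the file name.
            info = tarfile.TarInfo(name)
            info.pax_headers = {WHITEOUT_KEY: "y"}
            tar.addfile(info)
    return buf.getvalue()


def read_layer(blob):
    """Split a layer into (added files, whited-out paths) via the pax header."""
    added, removed = {}, []
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar:
            if member.pax_headers.get(WHITEOUT_KEY) == "y":
                removed.append(member.name)
            else:
                added[member.name] = tar.extractfile(member).read()
    return added, removed
```

For example, build_layer({"etc/.wh.example": b"data"}, ["etc/old.conf"]) produces a layer in which etc/.wh.example is plain content and etc/old.conf is the only whiteout. Under the v1 .wh.* convention the first name would be read as a whiteout, which is the limitation this issue is about.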

caniszczyk · 2017-08-31T21:01:56Z

@timthelion please keep things civil, we have a code of conduct that we expect our community to abide by: https://www.opencontainers.org/about/code-of-conduct

vbatts · 2018-10-11T18:53:04Z

This came up again recently at Container Camp UK in London. @cyphar and I were discussing what other options could provide a migration path away from the AUFS-inherited approach, particularly as we also want to prepare ourselves for a non-TAR future.

timthelion changed the title from "Any chance of changing the whiteout file approach" to "Any chance of changing the whiteout file approach?" on Apr 14, 2016

philips added component/serialization spec kind/version break kind/design priority/Pmaybe labels Apr 14, 2016

philips added this to the post-v1.0.0 milestone Apr 14, 2016

cgwalters mentioned this issue Aug 5, 2016:
[merged] Skip tests that use whiteouts under Docker/aufs ostreedev/ostree#437 (Closed)

wking mentioned this issue Sep 8, 2016:
config: Clarify ociVersion covering the configuration <-> runtime API opencontainers/runtime-spec#523 (Merged)

This was referenced Oct 28, 2016:
archive/tar: add support for arbitrary pax vendor extensions golang/go#14472 (Closed)
specs-go: clarify mediatypes #411 (Merged)
image-layout: clarification of oci-layout #434 (Merged)

cyphar mentioned this issue Nov 5, 2016:
layerdiff: generates redundant whiteouts opencontainers/umoci#2 (Closed)

wking mentioned this issue Nov 7, 2016:
name-assertion: Add a name-assertion type #445 (Closed)

wking mentioned this issue Aug 26, 2017:
Clarify expected PAX headers for xattrs (LIBARCHIVE.xattr. vs. SCHILY.xattr.) #725 (Open)

wking mentioned this issue Dec 10, 2017:
Distribution Specification opencontainers/tob#34 (Closed)

jonjohnsonjr mentioned this issue Nov 1, 2019:
Whiteouts don't work with overlayfs google/crfs#40 (Closed)

eric-badger mentioned this issue Jan 15, 2021:
YARN-10494 CLI tool for docker-to-squashfs conversion (pure Java). apache/hadoop#2513 (Open)
