layer: revamp for readability
When adding verbiage for restriction on duplicate entries in the tar
archive, it was not immediately clear where this information would be
organized.

This revamp seeks to make the document on the whole more suitable for
implementors and be interpretable for compliance.

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
vbatts committed Sep 6, 2016
1 parent 4f6b5fd commit 728e772
Showing 1 changed file with 131 additions and 33 deletions.
164 changes: 131 additions & 33 deletions layer.md
@@ -1,63 +1,129 @@
# Creating an Image Filesystem Changeset
# Filesystem Layer Changeset

An example of creating an Image Filesystem Changeset follows.
A layer changeset is the distributable difference between filesystems.
A layer changeset is not directly extracted, but applied to a filesystem.

An image root filesystem is first created as an empty directory.
Here is the initial empty directory structure for a changeset using the randomly-generated directory name `c3167915dc9d` ([actual layer DiffIDs are generated based on the content](#id_desc)).
## Distributable Format

Layer Changesets for the [mediatype](./media-types.md) `application/vnd.oci.image.layer.tar+gzip` MUST be packaged in [tar archive][tar-archive].
Layer Changesets for the [mediatype](./media-types.md) `application/vnd.oci.image.layer.tar+gzip` MUST NOT include duplicate entries for file paths in the resulting [tar archive][tar-archive].

## Change Types

Types of changes that can occur in a changeset are:

* Additions
* Modifications
* Removals

Additions and Modifications are reflected the same in the changeset tar archive.

Removals are reflected using "[whiteout](#whiteouts)" file entries (See [Representing Changes](#representing-changes)).

### File Attributes

File attributes that are provided for Additions and Modifications include:

* Modification Time (`mtime`)
* User ID (`uid`)
* User Name (`uname`) is secondary to `uid`
* Group ID (`gid`)
* Group Name (`gname`) is secondary to `gid`
* Mode (`mode`)
* Extended Attributes (`xattrs`)

[Sparse files](https://en.wikipedia.org/wiki/Sparse_file) SHOULD NOT be used.
Due to a lack of consistent support, sparse files may appear to mutate into regular files when a layer changeset is created or applied.

## Creating

### Initial Root Filesystem

An image root filesystem has an initial state as an empty directory.
The name of the directory is not relevant to the layer itself, only for the purpose of producing comparisons.

Here is an initial empty directory structure for a changeset, with a unique directory name `rootfs-c9d-v1`.

```
c3167915dc9d/
rootfs-c9d-v1/
```

### Populate Initial Filesystem

Files and directories are then created:

```
c3167915dc9d/
rootfs-c9d-v1/
etc/
my-app-config
bin/
my-app-binary
my-app-tools
```

The `c3167915dc9d` directory is then committed as a plain Tar archive with entries for the following files:
The `rootfs-c9d-v1` directory is then committed as a plain [tar archive][tar-archive] with paths relative to `rootfs-c9d-v1`.
The archive has entries for the following files:

```
etc/my-app-config
bin/my-app-binary
bin/my-app-tools
./
./etc/
./etc/my-app-config
./bin/
./bin/my-app-binary
./bin/my-app-tools
```

To make changes to the filesystem of this container image, create a new directory, such as `f60c56784b83`, and initialize it with a snapshot of the parent image's root filesystem, so that the directory is identical to that of `c3167915dc9d`.
NOTE: a copy-on-write or union filesystem can make this very efficient:
### Populate a Comparison Filesystem

Create a new directory and initialize it with an identical copy or snapshot of the prior root filesystem.
Example commands that can preserve file attributes to make this copy are:
* [cp(1)](http://linux.die.net/man/1/cp): `cp -pav rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
* [rsync(1)](http://linux.die.net/man/1/rsync): `rsync -avHAX rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
* [tar(1)](http://linux.die.net/man/1/tar): `mkdir rootfs-c9d-v1.s1 && tar --acls --xattrs -C rootfs-c9d-v1/ -c . | tar -C rootfs-c9d-v1.s1/ --acls --xattrs -x` (including `--selinux` where supported)

Any changes to the snapshot MUST NOT change or affect the directory it was copied from.

For example, `rootfs-c9d-v1.s1` is an identical snapshot of `rootfs-c9d-v1`.
In this way `rootfs-c9d-v1.s1` is prepared for updates and alterations.

**NOTE**: *a copy-on-write or union filesystem can efficiently make directory snapshots*

Initial layout of the snapshot:

```
f60c56784b83/
rootfs-c9d-v1.s1/
etc/
my-app-config
bin/
my-app-binary
my-app-tools
```

This example change is going to add a configuration directory at `/etc/my-app.d` which contains a default config file.
There's also a change to the `my-app-tools` binary to handle the config layout change.
The `f60c56784b83` directory then looks like this:
Changes include new files, file and path removals, and updates to file attributes (mtime, uid, gid, mode, xattrs, etc.).

For example, add a directory at `/etc/my-app.d` containing a default config file, and remove the existing config file.
Also change the `./bin/my-app-tools` binary to handle the config layout change.

Following these changes, the `rootfs-c9d-v1.s1` directory looks like this:

```
f60c56784b83/
rootfs-c9d-v1.s1/
etc/
.wh.my-app-config
my-app.d/
default.cfg
bin/
my-app-binary
my-app-tools
```

This reflects the removal of `/etc/my-app-config` and creation of a file and directory at `/etc/my-app.d/default.cfg`.
`/bin/my-app-tools` has also been replaced with an updated version.
Before committing this directory to a changeset, because it has a parent image, it is first compared with the directory tree of the parent snapshot, `f60c56784b83`, looking for files and directories that have been added, modified, or removed.
### Determining Changes

When two directories are compared, the relative root is the top-level directory.
The directories are compared, looking for files that have been added, modified, or removed.
Here "files" includes regular files, directories, sockets, symbolic links, block devices, character devices, and FIFOs.

For this example, `rootfs-c9d-v1/` and `rootfs-c9d-v1.s1/` are recursively compared, each as its own relative root path.

The following changeset is found:

```
@@ -66,23 +132,51 @@ Modified: /bin/my-app-tools
Deleted: /etc/my-app-config
```

A Tar Archive is then created which contains *only* this changeset:
This reflects the removal of `/etc/my-app-config` and creation of a file and directory at `/etc/my-app.d/default.cfg`.
`/bin/my-app-tools` has also been replaced with an updated version.

- Added and modified files and directories in their entirety
- Deleted files or directory marked with a whiteout file
### Representing Changes

A whiteout file is an empty file that prefixes the deleted paths basename `.wh.`.
When a whiteout is found in the upper changeset of a filesystem, any matching name in the lower changeset is ignored, and the whiteout itself is also hidden.
As files prefixed with `.wh.` are special whiteout tombstones it is not possible to create a filesystem which has a file or directory with a name beginning with `.wh.`.
A [tar archive][tar-archive] is then created which contains *only* this changeset:

The resulting Tar archive for `f60c56784b83` has the following entries:
- Added and modified files and directories in their entirety
- Deleted files or directories marked with a [whiteout file](#whiteouts)

The resulting tar archive for `rootfs-c9d-v1.s1` has the following entries:

```
/etc/my-app.d/default.cfg
/bin/my-app-tools
/etc/.wh.my-app-config
./etc/my-app.d/
./etc/my-app.d/default.cfg
./bin/my-app-tools
./etc/.wh.my-app-config
```
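
The mapping from a list of changes to tar entry names can be sketched as below. The `(kind, path)` input shape and the function name `entry_names` are assumptions made for illustration; only the `.wh.` prefix comes from the text.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def entry_names(changes: list) -> list:
    """Translate (kind, absolute path) pairs into tar entry names."""
    names = []
    for kind, path in changes:
        rel = "." + path  # "/etc/x" becomes "./etc/x"
        if kind == "Deleted":
            # A removal is recorded as a whiteout next to the removed path.
            parent, base = posixpath.split(rel)
            names.append(posixpath.join(parent, WHITEOUT_PREFIX + base))
        else:
            # Additions and modifications are stored in their entirety.
            names.append(rel)
    return names


changes = [
    ("Added", "/etc/my-app.d"),
    ("Added", "/etc/my-app.d/default.cfg"),
    ("Modified", "/bin/my-app-tools"),
    ("Deleted", "/etc/my-app-config"),
]
for name in entry_names(changes):
    print(name)
```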

## Applying

Layer Changesets of [mediatype](./media-types.md) `application/vnd.oci.image.layer.tar+gzip` are _applied_, rather than simply extracted as ordinary tar archives.

Applying a layer changeset requires consideration for the [whiteout](#whiteouts) files.
In the absence of any [whiteout](#whiteouts) files in a layer changeset, the archive is extracted like a regular tar archive.


### Changeset over existing files

This section covers applying an entry in a layer changeset, if the file path already exists.

If the file path is a directory, then the existing path just has its attributes set from the layer changeset for that file path.
If the file path is any other file type (regular file, FIFO, etc.), then:
* the file path is unlinked (see [`unlink(2)`](http://linux.die.net/man/2/unlink))
* the file is created
* if it is a regular file, its content is written
* attributes are set on the file path

## Whiteouts

A whiteout file is an empty file with a special filename that signifies a path should be deleted.
A whiteout filename consists of the prefix `.wh.` plus the basename of the path to be deleted.
As files prefixed with `.wh.` are special whiteout markers, it is not possible to create a filesystem which has a file or directory with a name beginning with `.wh.`.
When a whiteout is found in the upper changeset of a filesystem, any matching name in the lower changeset is ignored, and the whiteout itself is also hidden.

Whiteout files MUST only apply to resources in lower layers.
Files that are present in the same layer as a whiteout file can only be hidden by whiteout files in subsequent layers.
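
The rule that whiteouts only affect lower layers can be modeled over plain sets of path names. This is a sketch under simplifying assumptions: layers are represented as collections of entry names rather than real archives, opaque whiteouts are not handled, and the helper names are hypothetical.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def whiteout_target(name: str):
    """Return the path a whiteout entry deletes, or None for other entries."""
    parent, base = posixpath.split(name)
    if base.startswith(WHITEOUT_PREFIX):
        return posixpath.join(parent, base[len(WHITEOUT_PREFIX):])
    return None


def apply_layer(lower: set, upper: list) -> set:
    """Compute the visible paths after applying upper on top of lower."""
    result = set(lower)
    # Pass 1: whiteouts hide matching paths (and descendants) from lower only.
    for name in upper:
        target = whiteout_target(name)
        if target is not None:
            result = {
                p for p in result
                if p != target and not p.startswith(target + "/")
            }
    # Pass 2: regular entries of upper are added afterwards, so a whiteout
    # cannot hide a file from its own layer. Whiteouts themselves stay hidden.
    for name in upper:
        if whiteout_target(name) is None:
            result.add(name)
    return result


lower = {"etc/my-app-config", "bin/my-app-tools"}
upper = ["etc/.wh.my-app-config", "etc/my-app.d/default.cfg", "bin/my-app-tools"]
print(sorted(apply_layer(lower, upper)))
```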
The following is a base layer with several resources:
@@ -117,6 +211,8 @@ a/.wh..wh..opq

Implementations SHOULD generate layers such that the whiteout files appear before sibling directory entries.

### Opaque Whiteout

In addition to expressing that a single entry should be removed from a lower layer, layers may remove all of the children using an opaque whiteout entry.
An opaque whiteout entry is a file with the name `.wh..wh..opq` indicating that all siblings are hidden in the lower layer.
Let's take the following base layer as an example:
@@ -139,7 +235,7 @@ bin/
```

This is called _opaque whiteout_ format.
An _opaque whiteout_ file hides _all_ children of the `bin/` including sub-directories and all descendents.
An _opaque whiteout_ file hides _all_ children of the `bin/` including sub-directories and all descendants.
Using _explicit whiteout_ files, this would be equivalent to the following:

```
@@ -151,8 +247,10 @@ bin/

In this case, a unique whiteout file is generated for each entry.
If there were more children of `bin/` in the base layer, there would be an entry for each.
Note that this opaque file will apply to _all_ children, including sub-directories, other resources and all descendents.
Note that this opaque file will apply to _all_ children, including sub-directories, other resources and all descendants.

Implementations SHOULD generate layers using _explicit whiteout_ files, but MUST accept both.
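
Since generators are encouraged to emit explicit whiteouts, an opaque whiteout can be expanded against the known contents of the lower layer. The sketch below is illustrative only: `expand_opaque` is a hypothetical helper, and the lower-layer paths in the demo are made-up examples, not taken from the listings above.

```python
import posixpath

OPAQUE_NAME = ".wh..wh..opq"
WHITEOUT_PREFIX = ".wh."


def expand_opaque(lower_paths: list, opaque_entry: str) -> list:
    """Return one explicit whiteout per direct child of the opaque directory."""
    directory, base = posixpath.split(opaque_entry)
    if base != OPAQUE_NAME:
        raise ValueError("not an opaque whiteout entry: " + opaque_entry)
    prefix = directory + "/"
    # Descendants are hidden with their parent, so only the first path
    # component below the directory needs a whiteout.
    children = {
        path[len(prefix):].split("/", 1)[0]
        for path in lower_paths
        if path.startswith(prefix)
    }
    return sorted(posixpath.join(directory, WHITEOUT_PREFIX + c) for c in children)


# Hypothetical lower-layer contents, for illustration.
lower = ["bin/my-app-binary", "bin/my-app-tools", "bin/tools/my-app-tool-one", "etc/my-app-config"]
print(expand_opaque(lower, "bin/.wh..wh..opq"))
```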

Any given image is likely to be composed of several of these Image Filesystem Changeset tar archives.

[tar-archive]: https://en.wikipedia.org/wiki/Tar_(computing)
