Add --zero-file-timestamps flag
This change adds a new flag to zero timestamps in layer tarballs without
making a fully reproducible image.

My use case for this is maintaining a large image with build tooling.
I have a multi-stage Dockerfile that generates an image containing
several toolchains for cross-compilation, with each toolchain being
prepared in a separate stage before being COPY'd into the final image.
This is a very large image, and while it's incredibly convenient for
development, making a change as simple as adding one new tool tends to
invalidate caches and force the devs to download another 10+ GB image.

If timestamps were removed from each layer, these images would be mostly
unchanged with each minor update, greatly reducing disk space needed for
keeping old versions around and time spent downloading updated images.

I wanted to use Kaniko's --reproducible flag to help with this, but ran
into issues with memory consumption (GoogleContainerTools#862) and build time (GoogleContainerTools#1960).
Additionally, I didn't really care about reproducibility - I mainly
cared about the layers having identical contents so Docker could skip
pulling and storing redundant layers from a registry.

This solution works around these problems by stripping out timestamps as
the layer tarballs are built. It removes the need for a separate
postprocessing step, and preserves image metadata so we can still see
when the image itself was built.

An alternative solution would be to use mutate.Time much like Kaniko
currently uses mutate.Canonical to implement --reproducible, but that
would not be a satisfactory solution for me until
[issue 1168](google/go-containerregistry#1168)
is addressed by go-containerregistry. Given my lack of Go experience, I
don't feel comfortable tackling that myself, and this seems like a
simple and useful workaround in the meantime.
zx96 committed Apr 22, 2023
1 parent 24846d2 commit 3bbee81
Showing 7 changed files with 50 additions and 7 deletions.

17 changes: 16 additions & 1 deletion README.md
@@ -106,6 +106,7 @@ _If you are interested in contributing to kaniko, see
 - [Flag `--target`](#flag---target)
 - [Flag `--use-new-run`](#flag---use-new-run)
 - [Flag `--verbosity`](#flag---verbosity)
+- [Flag `--zero-file-timestamps`](#flag---zero-file-timestamps)
 - [Flag `--ignore-var-run`](#flag---ignore-var-run)
 - [Flag `--ignore-path`](#flag---ignore-path)
 - [Flag `--image-fs-extract-retry`](#flag---image-fs-extract-retry)
@@ -975,7 +976,11 @@ are:
 #### Flag `--reproducible`

 Set this flag to strip timestamps out of the built image and make it
-reproducible.
+reproducible. This will also remove timestamps and environment-specific
+data like the build machine from the image and layer metadata.
+
+If you want to preserve image metadata but strip timestamps from layer
+contents, see [`--zero-file-timestamps`](#flag---zero-file-timestamps).

 #### Flag `--single-snapshot`

@@ -1043,6 +1048,16 @@ file system snapshots. In some cases, this may improve build performance by 75%.
 Set this flag as `--verbosity=<panic|fatal|error|warn|info|debug|trace>` to set
 the logging level. Defaults to `info`.

+#### Flag `--zero-file-timestamps`
+
+Set this flag to strip timestamps from the files inside each layer.
+This will result in different image hashes, but can save space on disk
+when images share identical layers, especially when combined with a
+[multi-stage build](https://docs.docker.com/build/building/multi-stage/).
+
+If you want a fully reproducible image, see
+[`--reproducible`](#flag---reproducible).
+
 #### Flag `--ignore-var-run`

 Ignore /var/run when taking image snapshot. Set it to false to preserve

3 changes: 2 additions & 1 deletion cmd/executor/cmd/root.go
@@ -215,7 +215,8 @@ func addKanikoOptionsFlags() {
 	RootCmd.PersistentFlags().StringVarP(&opts.KanikoDir, "kaniko-dir", "", constants.DefaultKanikoPath, "Path to the kaniko directory, this takes precedence over the KANIKO_DIR environment variable.")
 	RootCmd.PersistentFlags().StringVarP(&opts.TarPath, "tar-path", "", "", "Path to save the image in as a tarball instead of pushing")
 	RootCmd.PersistentFlags().BoolVarP(&opts.SingleSnapshot, "single-snapshot", "", false, "Take a single snapshot at the end of the build.")
-	RootCmd.PersistentFlags().BoolVarP(&opts.Reproducible, "reproducible", "", false, "Strip timestamps out of the image to make it reproducible")
+	RootCmd.PersistentFlags().BoolVarP(&opts.Reproducible, "reproducible", "", false, "Strip all timestamps out of the image to make it reproducible")
+	RootCmd.PersistentFlags().BoolVarP(&opts.ZeroFileTimestamps, "zero-file-timestamps", "", false, "Strip file timestamps out of each layer to make layers reproducible")
 	RootCmd.PersistentFlags().StringVarP(&opts.Target, "target", "", "", "Set the target build stage to build")
 	RootCmd.PersistentFlags().BoolVarP(&opts.NoPush, "no-push", "", false, "Do not push the image to the registry")
 	RootCmd.PersistentFlags().BoolVarP(&opts.NoPushCache, "no-push-cache", "", false, "Do not push the cache layers to the registry")

1 change: 1 addition & 0 deletions pkg/config/options.go
@@ -73,6 +73,7 @@ type KanikoOptions struct {
 	ImageFSExtractRetry int
 	SingleSnapshot bool
 	Reproducible bool
+	ZeroFileTimestamps bool
 	NoPush bool
 	NoPushCache bool
 	Cache bool

2 changes: 2 additions & 0 deletions pkg/executor/build.go
@@ -62,6 +62,7 @@ var (
 type cachePusher func(*config.KanikoOptions, string, string, string) error
 type snapShotter interface {
 	Init() error
+	SetZeroTimestamps(bool)
 	TakeSnapshotFS() (string, error)
 	TakeSnapshot([]string, bool, bool) (string, error)
 }
@@ -112,6 +113,7 @@ func newStageBuilder(args *dockerfile.BuildArgs, opts *config.KanikoOptions, sta
 	}
 	l := snapshot.NewLayeredMap(hasher)
 	snapshotter := snapshot.NewSnapshotter(l, config.RootDir)
+	snapshotter.SetZeroTimestamps(opts.ZeroFileTimestamps || opts.Reproducible)

 	digest, err := sourceImage.Digest()
 	if err != nil {

1 change: 1 addition & 0 deletions pkg/executor/fakes.go
@@ -35,6 +35,7 @@ type fakeSnapShotter struct {
 }

 func (f fakeSnapShotter) Init() error { return nil }
+func (f fakeSnapShotter) SetZeroTimestamps(_ bool) { }
 func (f fakeSnapShotter) TakeSnapshotFS() (string, error) {
 	return f.tarPath, nil
 }

14 changes: 11 additions & 3 deletions pkg/snapshot/snapshot.go
@@ -38,9 +38,10 @@ var snapshotPathPrefix = ""

 // Snapshotter holds the root directory from which to take snapshots, and a list of snapshots taken
 type Snapshotter struct {
-	l          *LayeredMap
-	directory  string
-	ignorelist []util.IgnoreListEntry
+	l              *LayeredMap
+	directory      string
+	ignorelist     []util.IgnoreListEntry
+	zeroTimestamps bool
 }

 // NewSnapshotter creates a new snapshotter rooted at d
@@ -60,6 +61,11 @@ func (s *Snapshotter) Key() (string, error) {
 	return s.l.Key()
 }

+// SetZeroTimestamps changes whether snapshots will have file timestamps cleared
+func (s *Snapshotter) SetZeroTimestamps(zeroTimestamps bool) {
+	s.zeroTimestamps = zeroTimestamps
+}
+
 // TakeSnapshot takes a snapshot of the specified files, avoiding directories in the ignorelist, and creates
 // a tarball of the changed files. Return contents of the tarball, and whether or not any files were changed
 func (s *Snapshotter) TakeSnapshot(files []string, shdCheckDelete bool, forceBuildMetadata bool) (string, error) {
@@ -112,6 +118,7 @@ func (s *Snapshotter) TakeSnapshot(files []string, shdCheckDelete bool, forceBui
 	}

 	t := util.NewTar(f)
+	t.SetZeroTimestamps(s.zeroTimestamps)
 	defer t.Close()
 	if err := writeToTar(t, filesToAdd, filesToWhiteout); err != nil {
 		return "", err
@@ -128,6 +135,7 @@ func (s *Snapshotter) TakeSnapshotFS() (string, error) {
 	}
 	defer f.Close()
 	t := util.NewTar(f)
+	t.SetZeroTimestamps(s.zeroTimestamps)
 	defer t.Close()

 	filesToAdd, filesToWhiteOut, err := s.scanFullFilesystem()

19 changes: 17 additions & 2 deletions pkg/util/tar_util.go
@@ -28,6 +28,7 @@ import (
 	"path/filepath"
 	"strings"
 	"syscall"
+	"time"

 	"github.com/GoogleContainerTools/kaniko/pkg/config"
 	"github.com/docker/docker/pkg/archive"
@@ -38,8 +39,9 @@ import (

 // Tar knows how to write files to a tar file.
 type Tar struct {
-	hardlinks map[uint64]string
-	w         *tar.Writer
+	hardlinks      map[uint64]string
+	zeroTimestamps bool
+	w              *tar.Writer
 }

 // NewTar will create an instance of Tar that can write files to the writer at f.
@@ -76,6 +78,11 @@ func (t *Tar) Close() {
 	t.w.Close()
 }

+// SetZeroTimestamps changes whether AddFileToTar will zero timestamps in the archive.
+func (t *Tar) SetZeroTimestamps(zeroTimestamps bool) {
+	t.zeroTimestamps = zeroTimestamps
+}
+
 // AddFileToTar adds the file at path p to the tar
 func (t *Tar) AddFileToTar(p string) error {
 	i, err := os.Lstat(p)
@@ -121,6 +128,14 @@ func (t *Tar) AddFileToTar(p string) error {
 	// use PAX format to preserve accurate mtime (match Docker behavior)
 	hdr.Format = tar.FormatPAX

+	if t.zeroTimestamps {
+		// clear atime, ctime, and mtime
+		epoch := time.Unix(0, 0)
+		hdr.AccessTime = epoch
+		hdr.ChangeTime = epoch
+		hdr.ModTime = epoch
+	}
+
 	hardlink, linkDst := t.checkHardlink(p, i)
 	if hardlink {
 		hdr.Linkname = linkDst