Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for SOURCE_DATE_EPOCH #2918

Merged
merged 6 commits into from
Oct 13, 2022
Merged

Conversation

tonistiigi
Copy link
Member

@tonistiigi tonistiigi commented Jun 21, 2022

addresses #1058

Allows reproducible timestamps for layer and image timestamps.

Implemented as a frontend-opt because in the future, the same option could be detected by frontend for custom behavior
and the same value should also apply timestamps for FileOps. Definition for SOURCE_DATE_EPOCH can be found in https://reproducible-builds.org/docs/source-date-epoch/

In local and tar exporter, setting the build-arg sets overwrites the timestamp for the output files.

There is also a secondary implementation inside the Dockerfile frontend. This allows getting reproducible image manifests even on old buildkit, by just defining #syntax to point to new Dockerfile image.

In Dockerfiles the build-arg can also be exposed to inner processes by defining ARG SOURCE_DATE_EPOCH inside the stage.

This is the first stop toward better reproducible builds support that can be followed up with:

  • Overwrite timestamp for files created with FileOp, eg. COPY/ADD in Dockerfiles dockerfile: add (ADD|COPY) --timestamp=<RFC3339Nano> #2911 @AkihiroSuda
  • Possibly allow setting timestamp from Dockerfile, eg. #epoch= . Frontend side of this is already implemented and the frontend can pass the epoch value to the exporter(it can not overwrite the value if it has already been set by the user).
  • Add some special case that allows using setting epoch to git commit timestamp when building from git. Currently need to do --build-arg SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
  • Maybe allow configuring differ or overwriting files created by ExecOp

@tonistiigi tonistiigi force-pushed the source-date-epoch branch 2 times, most recently from eb72e03 to 7f2c1a1 Compare June 21, 2022 05:29
@crazy-max
Copy link
Member

@sudo-bmitch
Copy link

sudo-bmitch commented Jul 3, 2022

Can we pass this into the RUN steps without defining an ARG in the Dockerfile? I.e. the same way we do for http proxy today?

@tonistiigi
Copy link
Member Author

Can we pass this into the RUN steps without defining an ARG in the Dockerfile? I.e. the same way we do for http proxy today?

I think defining ARG should be required. This changes the behavior of the RUN and output of the created image so should be decision user makes. Proxy args are different because the expectation is that when two users are behind a different proxy they still generate the same image and proxy only affects their local network configuration and not the build result itself.

@sudo-bmitch
Copy link

sudo-bmitch commented Jul 14, 2022

I think defining ARG should be required. This changes the behavior of the RUN and output of the created image so should be decision user makes.

Since the user is running the build command, what if the forced ARG was added to the steps even if not defined in the Dockerfile (so it would show up in the history)? The goal I'm working towards is making a Dockerfile reproducible without needing to modify the Dockerfile itself.

@tonistiigi
Copy link
Member Author

It would, for example, mean that build-cache does not apply between builds that use epoch and ones that do not, even if the run steps make no use of the added argument. I don't think avoiding changes in Dockerfile is a goal (in fact, we are probably adding #epoch), especially if the behavior of the Dockerfile commands changes. Ultimately it is the Dockerfile author who knows if the process they use in RUN step understands the env variable or not.

control/control.go Outdated Show resolved Hide resolved
@@ -372,5 +372,6 @@ RUN --security=insecure cat /proc/self/status | grep CapEff
* `BUILDKIT_MULTI_PLATFORM=<bool>` opt into determnistic output regardless of multi-platform output or not
* `BUILDKIT_SANDBOX_HOSTNAME=<string>` set the hostname (default `buildkitsandbox`)
* `BUILDKIT_SYNTAX=<image>` set frontend image
* `SOURCE_DATE_EPOCH` set the UNIX timestamp for created image and layers. More info from [reproducible builds](https://reproducible-builds.org/docs/source-date-epoch/).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to clarify the syntax version (1.5, unreleased)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@AkihiroSuda
Copy link
Member

LGTM but a couple of nits.
Also needs rebase.

@tonistiigi tonistiigi force-pushed the source-date-epoch branch 2 times, most recently from e78f77d to f9dc044 Compare August 5, 2022 05:29
@tonistiigi tonistiigi requested a review from AkihiroSuda August 5, 2022 05:33
Copy link
Member

@crazy-max crazy-max left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tonistiigi
Copy link
Member Author

@AkihiroSuda ok to merge?

@crazy-max
Copy link
Member

needs rebase, PTAL @AkihiroSuda

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after rebase

@thaJeztah
Copy link
Member

Looks like this needs a rebase again

@AkihiroSuda
Copy link
Member

OCI exporter needs to be fixed to reset "org.opencontainers.image.created" annotation

desc.Annotations[ocispecs.AnnotationCreated] = time.Now().UTC().Format(time.RFC3339)

@spacedub
Copy link

spacedub commented Oct 5, 2022

Is this going to apply as well for whiteout files?

@AkihiroSuda
Copy link
Member

And this timestamp is also non-reproducible

img := images.Image{
Target: *desc,
CreatedAt: time.Now(),
}

@AkihiroSuda
Copy link
Member

Is this going to apply as well for whiteout files?

Needs:

Allows reproducible timestamps for layer and image timestamps.

Implemented as a frontend-opt because in the future same
option could be detected by frontend for custom behavior
and same value should also apply timestamps for FileOps.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
For daemons that don’t support SOURCE_DATE_EPOCH on
the exporter we can use these overrides on the frontend
side.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

My comments above can be addressed later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants