Random LABEL snp-multi-stage-id invalidates docker cache #1325

Open
ppiotrow opened this issue Apr 5, 2020 · 5 comments

@ppiotrow
Contributor

ppiotrow commented Apr 5, 2020

Expected behaviour

I'm currently writing a second post about Docker cache efficiency with sbt.
My main focus is to use the Docker cache in CI environments as much as possible. I have already managed great improvements, but there is a problem with the randomly generated labels.

LABEL snp-multi-stage="intermediate"
LABEL snp-multi-stage-id="44857d33-aef2-4d80-b811-7d1ed9b1891d"

Executing the second command always invalidates the cache, which is not always what you want,
especially when dockerAutoremoveMultiStageIntermediateImages := false is used.
I suggest using something deterministic, like stage0-(packageName in Docker).value
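
Until the plugin changes, the suggestion can be approximated from the build definition. The following is a minimal sketch, not the plugin's own implementation: it assumes sbt-native-packager's DockerPlugin is enabled, that the generated command has the shape `Cmd("LABEL", "snp-multi-stage-id=...")`, and that `stableId` (a name made up for this example) is an acceptable id. Whether automatic removal of intermediate images still matches the rewritten label has not been verified here.

```scala
// build.sbt fragment (sketch); assumes DockerPlugin from sbt-native-packager is enabled
import com.typesafe.sbt.packager.docker._

dockerCommands := {
  // Deterministic id in the form suggested above.
  // (Docker / packageName) is the slash-syntax spelling of (packageName in Docker).
  val stableId = "stage0-" + (Docker / packageName).value

  dockerCommands.value.map {
    // Assumption: the label name is the start of the first LABEL argument.
    case Cmd("LABEL", args @ _*) if args.headOption.exists(_.startsWith("snp-multi-stage-id")) =>
      Cmd("LABEL", "snp-multi-stage-id=\"" + stableId + "\"")
    case other => other
  }
}
```

With this in place the LABEL instruction text is identical from build to build, so Docker can reuse that layer and the ones after it as long as nothing earlier changes.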

Actual behaviour

Step 1/20 : FROM repo.mycompany.com/team/openjre:8u242 as stage0
[info]  ---> a020ce624573
[info] Step 2/20 : LABEL snp-multi-stage="intermediate"
[info]  ---> Using cache
[info]  ---> 9abeed0b5a9f
[info] Step 3/20 : LABEL snp-multi-stage-id="44857d33-aef2-4d80-b811-7d1ed9b1891d"
[info]  ---> Running in 072dfdce55ad
[info] Removing intermediate container 072dfdce55ad
[info]  ---> 554d25368747
[info] Step 4/20 : WORKDIR /opt/my-app
[info]  ---> Running in 49f3e275dc8c

As you can see, the cache works for the first label but gets invalidated after the second, random label.
@mkurz What do you think about a deterministic label?
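
To illustrate why one random label affects every later step, here is a simplified model in plain Scala. It is only a sketch: it assumes a layer's cache key depends on the parent layer plus the instruction text, and it stands in for Docker's real cache lookup with a hash. `LayerCacheSketch`, `cacheKeys`, `dockerfile`, and `hits` are hypothetical names for this example.

```scala
import java.util.UUID

object LayerCacheSketch {
  // Simplified model: each key is derived from the parent key and the instruction text.
  def cacheKeys(instructions: Seq[String]): Seq[Int] =
    instructions.scanLeft(0)((parent, instr) => (parent, instr).hashCode).tail

  // The first four steps of the build log above, with a configurable id.
  def dockerfile(id: String): Seq[String] = Seq(
    "FROM repo.mycompany.com/team/openjre:8u242 as stage0",
    "LABEL snp-multi-stage=\"intermediate\"",
    "LABEL snp-multi-stage-id=\"" + id + "\"",
    "WORKDIR /opt/my-app"
  )

  // For each step, true if two builds would share the cached layer in this model.
  def hits(first: Seq[String], second: Seq[String]): Seq[Boolean] =
    cacheKeys(first).zip(cacheKeys(second)).map { case (a, b) => a == b }

  def main(args: Array[String]): Unit = {
    // Two builds with random ids: steps from the random label onward miss.
    println(hits(dockerfile(UUID.randomUUID().toString), dockerfile(UUID.randomUUID().toString)))
    // Two builds with the same deterministic id: every step hits.
    println(hits(dockerfile("stage0-my-app"), dockerfile("stage0-my-app")))
  }
}
```

In this model the miss at the random label propagates to WORKDIR and everything after it, which matches the "Running in" lines for steps 3 and 4 in the log.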

@mkurz
Member

mkurz commented Apr 5, 2020

@ppiotrow I think I can live with a deterministic label. Actually, there was already a discussion about whether we should use a random id (like we do now) or something more deterministic. Please have a look at the comment here and also my answer. As you can see, my main argument was that I want to avoid any side effects if possible. For example, say creating an image fails and the user wants to inspect it later. If another build then runs and succeeds, a deterministic label means it will also delete the image from the previous build, which we actually wanted to keep.
However, I think switching to a deterministic label for caching purposes would be an acceptable compromise if the win in performance and disk space is much higher. WDYT? Will it be worth it?

@ppiotrow
Contributor Author

ppiotrow commented Apr 6, 2020

I like the existing idea of having two layers: snp-multi-stage to wipe out all intermediate layers from sbt Docker builds, and snp-multi-stage-id to handle only the build-specific image.
I don't really follow the argument about inspecting the image later, but that is influenced by my environment. I usually run builds on Docker-in-Docker CI servers, so unpushed images are simply gone for me. If I run a build locally, I inspect it right after it fails.

From my point of view, the caching capabilities, reproducible (non-random) builds, and simpler unit tests matter more. I'd like to hear the opinion of someone with a different CI setup.

@mkurz
Member

mkurz commented Apr 6, 2020

@ppiotrow Let's just change snp-multi-stage-id to something deterministic. I am fine with that. However, I will not do that work myself; I'm too busy right now.

@muuki88 muuki88 added the docker label Apr 6, 2020
@stoiev

stoiev commented Aug 13, 2020

If you can live without those labels, it's possible to simply remove them as a workaround:

dockerCommands := dockerCommands.value.filter {
  case Cmd("LABEL", args @ _*) => args.head.startsWith("snp-multi-stage")
  case _                       => true
}

@an-tex

an-tex commented Aug 17, 2021

If you can live without those labels, it's possible to simply remove them as a workaround:

dockerCommands := dockerCommands.value.filter {
  case Cmd("LABEL", args @ _*) => args.head.startsWith("snp-multi-stage")
  case _                       => true
}

Thanks for the workaround! There's just a negation missing, it should be:
case Cmd("LABEL", args @ _*) => !args.head.startsWith("snp-multi-stage")
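
Put together, the corrected workaround might look like the sketch below. It assumes sbt-native-packager's DockerPlugin is enabled and adds the import that `Cmd` needs in build.sbt. Using `headOption` instead of `head` is a small defensive change that is not part of the original snippet. Note that this drops both snp-multi-stage labels, so any cleanup that relies on them presumably no longer finds the intermediate image.

```scala
// build.sbt fragment (sketch); assumes DockerPlugin from sbt-native-packager is enabled
import com.typesafe.sbt.packager.docker._

dockerCommands := dockerCommands.value.filter {
  // Keep a LABEL command only if its first argument is not one of the snp-multi-stage labels.
  case Cmd("LABEL", args @ _*) => !args.headOption.exists(_.startsWith("snp-multi-stage"))
  // Keep every other command unchanged.
  case _                       => true
}
```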

In general, I believe a deterministic id should be the default. More users care about a fast build than about inspecting their failed builds.
