Caching does not work with ARG inside the Dockerfile #1910

Closed
apollo13 opened this issue Feb 2, 2022 · 7 comments · Fixed by #1915
Comments

apollo13 (Contributor) commented Feb 2, 2022

Note to maintainers: I attached a patch in #1915; I'm not sure how to test it, though, so hints are welcome.

Actual behavior
I am running kaniko with --cache-copy-layers=true and it fails to utilize the cache in certain situations.

Expected behavior
I expect kaniko to reuse the cache; it looks as if the cache IDs are generated incorrectly.

To Reproduce

  1. Create the following Dockerfile:
FROM alpine as build
ARG TEST=123
RUN apk add --no-cache tar
  2. Run the following command multiple times and observe that the cache is not used:
docker run -v `pwd`:/workspace gcr.io/kaniko-project/executor:bde904349e9b7797d14f12c7a2657bf79145d18e-debug --dockerfile /workspace/Dockerfile --context dir:///workspace/ --destination 10.0.0.6:5000/test --cache=true --cache-copy-layers=true

Additional Information

  • Dockerfile: See above
  • Build Context: empty aside from the Dockerfile
  • Kaniko Image (fully qualified with digest): gcr.io/kaniko-project/executor:bde904349e9b7797d14f12c7a2657bf79145d18e-debug (a commit from yesterday)

Looking at the debug output the cache key changes:

DEBU[0000] Skip resolving path /cache                   
DEBU[0001] No file found for cache key sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 stat /cache/sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3: no such file or directory 
DEBU[0001] Image alpine not found in cache              
INFO[0001] Returning cached image manifest              
INFO[0002] Returning cached image manifest              
DEBU[0002] No file found for cache key sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 stat /cache/sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3: no such file or directory 
DEBU[0002] Image alpine not found in cache              
INFO[0002] Returning cached image manifest              
DEBU[0002] optimize: composite key for command ARG TEST=123 {[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG TEST=123]} 
DEBU[0002] optimize: cache key for command ARG TEST=123 186ec9fee220d1452a3af884e27fbb20dd3ef21eca7b68aaa316b058359fe73d 
DEBU[0002] optimize: composite key for command RUN apk add --no-cache tar {[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG TEST=123 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 RUN apk add --no-cache tar]} 
DEBU[0002] optimize: cache key for command RUN apk add --no-cache tar a281ac04c35f422a7956387c1d32787139a79b22a87edd9f7933fbdd3fd65819 
INFO[0002] Checking for cached layer 10.0.0.6:5000/test/cache:a281ac04c35f422a7956387c1d32787139a79b22a87edd9f7933fbdd3fd65819... 
DEBU[0002] Failed to retrieve layer: GET http://10.0.0.6:5000/v2/test/cache/manifests/a281ac04c35f422a7956387c1d32787139a79b22a87edd9f7933fbdd3fd65819: MANIFEST_UNKNOWN: manifest unknown; map[Tag:a281ac04c35f422a7956387c1d32787139a79b22a87edd9f7933fbdd3fd65819] 
INFO[0002] No cached layer found for cmd RUN apk add --no-cache tar 
DEBU[0002] Key missing was: sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3-|1-PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin-ARG TEST=123-|2-PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin-TEST=123-RUN apk add --no-cache tar 
INFO[0002] Unpacking rootfs as cmd RUN apk add --no-cache tar requires it. 
DEBU[0003] build: composite key for command ARG TEST=123 &{[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 ARG TEST=123]} 
DEBU[0003] build: cache key for command ARG TEST=123 6c4bf2e8db648ae9c9b1db201aff474c5302bddb61b2444debd20ae0a00ebdb1 
INFO[0003] RUN apk add --no-cache tar                   
INFO[0003] args: [-c apk add --no-cache tar]            
INFO[0003] Running: [/bin/sh -c apk add --no-cache tar] 
DEBU[0004] build: composite key for command RUN apk add --no-cache tar &{[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 ARG TEST=123 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 RUN apk add --no-cache tar]} 
DEBU[0004] build: cache key for command RUN apk add --no-cache tar a4329b9603a45f299011c221612a695286390059448babb87652103053db4209 
INFO[0004] Pushing layer 10.0.0.6:5000/test/cache:a4329b9603a45f299011c221612a695286390059448babb87652103053db4209 to cache now 
INFO[0004] Pushing image to 10.0.0.6:5000/test/cache:a4329b9603a45f299011c221612a695286390059448babb87652103053db4209 
INFO[0004] Pushed 10.0.0.6:5000/test/cache@sha256:ba36e231d062103b990c1106648f707f841039c6ade8cbb435b045af88479548 
DEBU[0004] mapping digest sha256:e53ebcc21ce659d505b6b9bf7fd251af6d118cf420abbc0ce94ec825b8de9a0f to cachekey a281ac04c35f422a7956387c1d32787139a79b22a87edd9f7933fbdd3fd65819 

Triage Notes for the Maintainers

Description (Yes/No):

  • Please check if this is a new feature you are proposing
  • Please check if the build works in docker but not in kaniko
  • Please check if this error is seen when you use the --cache flag
  • Please check if your dockerfile is a multistage dockerfile
apollo13 (Contributor, Author) commented Feb 2, 2022

Comparing the composite keys I see:

|1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG TEST
|2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 RUN apk add --no-cache tar

vs

|2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 ARG TEST
|2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 RUN apk add --no-cache tar

I am pretty sure that TEST=123 is wrong; it should only be set after the ARG command has run.
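To see why the stray TEST=123 flips the cache key, remember that the composite key is ultimately collapsed into a single hash, so any difference in the environment portion changes the result. Below is a minimal sketch, not kaniko's actual implementation: hashComposite is a hypothetical helper, and the PATH values are abbreviated for readability.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// hashComposite sketches how a list of composite-key parts could be
// collapsed into a cache key: join the parts and SHA-256 the result.
func hashComposite(parts []string) string {
	sum := sha256.Sum256([]byte(strings.Join(parts, "-")))
	return fmt.Sprintf("%x", sum)
}

func main() {
	// Base-image digest taken from the log above.
	base := "sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3"

	// optimize sees the ARG command before TEST=123 is in the environment...
	optimizeKey := hashComposite([]string{base, "|1 PATH=... ARG TEST=123"})
	// ...while build sees TEST=123 already set, so the joined string differs.
	buildKey := hashComposite([]string{base, "|2 PATH=... TEST=123 ARG TEST=123"})

	fmt.Println(optimizeKey == buildKey) // prints false: the keys diverge, so the cache lookup misses
}
```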

apollo13 added a commit to apollo13/kaniko that referenced this issue Feb 2, 2022
…oogleContainerTools#1910

This seems to be required because otherwise build & optimize calculate different cache keys.
apollo13 (Contributor, Author) commented Feb 2, 2022

For my simple Dockerfile this fix works:

diff --git a/pkg/executor/build.go b/pkg/executor/build.go
index 995c6737..45b3cae7 100644
--- a/pkg/executor/build.go
+++ b/pkg/executor/build.go
@@ -242,6 +242,13 @@ func (s *stageBuilder) optimize(compositeKey CompositeCache, cfg v1.Config) erro
                        return errors.Wrap(err, "failed to get files used from context")
                }
 
+               // Mutate the config for any commands that require it.
+               if command.MetadataOnly() {
+                       if err := command.ExecuteCommand(&cfg, s.args); err != nil {
+                               return err
+                       }
+               }
+
                compositeKey, err = s.populateCompositeKey(command, files, compositeKey, s.args, cfg.Env)
                if err != nil {
                        return err
@@ -272,13 +279,6 @@ func (s *stageBuilder) optimize(compositeKey CompositeCache, cfg v1.Config) erro
                                s.cmds[i] = cacheCmd
                        }
                }
-
-               // Mutate the config for any commands that require it.
-               if command.MetadataOnly() {
-                       if err := command.ExecuteCommand(&cfg, s.args); err != nil {
-                               return err
-                       }
-               }
        }
        return nil
 }

Since build() runs optimize first, it sees the results of all metadata commands even if they have not run yet. I think an actual fix needs to do even more, but I am still digging through the code.

apollo13 (Contributor, Author) commented Feb 2, 2022

Yeah, my fix breaks down as soon as I use the following Dockerfile:

FROM alpine as build
ARG TEST=123
ARG TEST2=234
RUN apk add --no-cache tar

with the following difference in the composite key:

{[
sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 
|2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 ARG TEST=123 
|3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 TEST2=234 ARG TEST2=234 
|3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 TEST2=234 RUN apk add --no-cache tar
]}

vs

{[
sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 
|3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 TEST2=234 ARG TEST=123 
|3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST2=234 TEST=123 ARG TEST2=234 
|3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TEST=123 TEST2=234 RUN apk add --no-cache tar
]} 

vinman1 commented Feb 7, 2022

I'm running into exactly the same issue. In my case, the Dockerfile resembles

FROM alpine:latest

ARG ARG1="ARG1_VALUE"

ENV ENV1="ENV1_VALUE"

RUN echo "hi"

I call kaniko this way:

$ /kaniko/executor --destination=my.registry.com/vinman1:latest --context . --dockerfile ./Dockerfile --force --cache=true --cache-copy-layers=true --cache-ttl=672h --cache-repo=my.registry.com/vinman1/cache --compressed-caching=false --verbosity debug

Here's the debug log.

Most notably, as mentioned above, the optimize: composite key and the build: composite key differ for the command ARG ARG1="ARG1_VALUE":

  • optimize step:
    DEBU[0001] optimize: composite key for command ARG ARG1="ARG1_VALUE" {[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG ARG1=ARG1_VALUE]}
  • build step:
    DEBU[0002] build: composite key for command ARG ARG1="ARG1_VALUE" &{[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG1=ARG1_VALUE ARG ARG1=ARG1_VALUE]}

I share @apollo13's opinion that the build: composite key for the ARG ARG1="ARG1_VALUE" command is wrong. The ARG1=ARG1_VALUE string should only be added to the environment portion of the composite key for commands subsequent to the ARG definition itself.

A good example of how it should work is the ENV command. In the same debug log, we see that the last portion of the optimize: composite key and the build: composite key match for the ENV ENV1="ENV1_VALUE" command (and the whole key would match if it were not for this issue):

  • optimize step:
    DEBU[0001] optimize: composite key for command ENV ENV1="ENV1_VALUE" {[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG ARG1=ARG1_VALUE |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG1=ARG1_VALUE ENV ENV1=ENV1_VALUE]}
  • build step:
    DEBU[0002] build: composite key for command ENV ENV1="ENV1_VALUE" &{[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG1=ARG1_VALUE ARG ARG1=ARG1_VALUE |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG1=ARG1_VALUE ENV ENV1=ENV1_VALUE]} 

vinman1 commented Feb 8, 2022

Great work @apollo13 👍

I've applied your code change to a local fork of the executor and can confirm that it fixes the issue I described above.

However, while testing, I noticed that there seems to be another issue... ☹️

Let's take this second example Dockerfile:

FROM alpine:latest

ARG ARG3="ARG3_VALUE"
ARG ARG2="ARG2_VALUE"
ARG ARG1="ARG1_VALUE"

RUN echo "hi"

Notice the ordering of the ARG statements. I've noticed that ARGs are not always added to the environment portion of a composite key's layers in the same order.

Here's a second debug log.

In particular, look at the optimize: composite key and build: composite key of the echo "hi" RUN command:

  • optimize stage:
    DEBU[0001] optimize: composite key for command RUN echo "hi" {[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG ARG3=ARG3_VALUE |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG3=ARG3_VALUE ARG ARG2=ARG2_VALUE |3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG3=ARG3_VALUE ARG2=ARG2_VALUE ARG ARG1=ARG1_VALUE |4 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG2=ARG2_VALUE ARG1=ARG1_VALUE ARG3=ARG3_VALUE RUN echo hi]} 
    DEBU[0001] optimize: cache key for command RUN echo "hi" d198b379b40896422ff7ebc8217ea98aa60f21a858f73c19d61129a295bf3d7d
  • build stage:
    DEBU[0002] build: composite key for command RUN echo "hi" &{[sha256:e7d88de73db3d3fd9b2d63aa7f447a10fd0220b7cbf39803c803f2af9ba256b3 |1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG ARG3=ARG3_VALUE |2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG3=ARG3_VALUE ARG ARG2=ARG2_VALUE |3 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG2=ARG2_VALUE ARG3=ARG3_VALUE ARG ARG1=ARG1_VALUE |4 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ARG3=ARG3_VALUE ARG2=ARG2_VALUE ARG1=ARG1_VALUE RUN echo hi]} 
    DEBU[0002] build: cache key for command RUN echo "hi" 834145f1416520996766266283d7f44d5703d3c32dd3ddf62c473f49dbb0044d 

As you can see, some reordering is applied to the last two layers of the composite key, but not consistently between the optimize and build stages. This results in different cache keys being generated between those stages, which again leads to 100% cache misses for this Dockerfile.

apollo13 (Contributor, Author) commented Feb 8, 2022

@vinman1 There seems to be a bug in the ReplacementEnvs function: it corrupts the underlying slice, leading to odd behavior (it also needs to sort its output to be consistent, since it adds data from a map). I have force-pushed a new commit; please retry with all your examples.
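The two failure modes described here are classic Go pitfalls, and a minimal standalone sketch (hypothetical code, not kaniko's actual ReplacementEnvs) illustrates both: append on a slice with spare capacity writes into the caller's backing array, and collecting values from a map without sorting yields a nondeterministic order.

```go
package main

import (
	"fmt"
	"sort"
)

func main() {
	// Pitfall 1: append on a shared slice can clobber the caller's data
	// when the backing array still has spare capacity.
	base := make([]string, 2, 4)
	base[0], base[1] = "PATH=/bin", "HOME=/root"
	a := append(base, "TEST=123")  // writes into base's backing array
	_ = append(base, "TEST2=234")  // overwrites the very same slot
	fmt.Println(a[2])              // prints TEST2=234, not TEST=123

	// Pitfall 2: Go map iteration order is randomized, so env vars
	// collected from a map must be sorted before they feed a cache key.
	args := map[string]string{"TEST": "123", "TEST2": "234"}
	keys := make([]string, 0, len(args))
	for k := range args {
		keys = append(keys, k)
	}
	sort.Strings(keys) // without this, the composite key order would vary run to run
	envs := make([]string, 0, len(keys))
	for _, k := range keys {
		envs = append(envs, k+"="+args[k])
	}
	fmt.Println(envs) // prints [TEST=123 TEST2=234]
}
```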

vinman1 commented Feb 8, 2022

Ok @apollo13, I've re-synced my local fork with your latest changes and can confirm that they fix both the original problem and the ordering issue above for all my test cases. Again, great work 👏

imjasonh pushed a commit that referenced this issue Feb 9, 2022
* Restore build args after optimize. Fixes #1910, #1912.

* Apply review suggestions.