Regression: #525 causes missing files when caching is enabled in some situations #631

Closed
discordianfish opened this issue Mar 28, 2019 · 5 comments

@discordianfish
Contributor

discordianfish commented Mar 28, 2019

Actual behavior
First I observed that after enabling caching and building once, subsequent builds appear to miss the files the cached steps should have created. Disabling the cache fixes the problem.

When running the builds with the debug image, I see that it uses the cache here as expected:

INFO[0001] Using caching version of cmd: RUN mix local.hex --force

This step should have created files that appear to be missing in the next steps, since they fail with:

INFO[0045] Using files from context: [/src]
INFO[0045] COPY . /app
INFO[0046] Taking snapshot of files...
INFO[0046] RUN mix compile
INFO[0046] cmd: /bin/sh
INFO[0046] args: [-c mix compile]
"File operation error: badarg. Target: .. Function: read_file_info. Process: code_server."
=ERROR REPORT==== 28-Mar-2019::15:55:29 ===
File operation error: badarg. Target: .. Function: read_file_info. Process: code_server.
15:55:29.584 [error] 'File operation error: badarg. Target: /root/.mix/archives/hex-0.19.0/hex-0.19.0/ebin. Function: read_file_info. Process: code_server.' Could not find Hex, which is needed to build dependency :phoenix
2019-03-28 15:55:29 std_error           ** (Mix) Could not find an SCM for dependency :phoenix from Gateway.Mixfile
Shall I install Hex? (if running non-interactively, use "mix local.hex --force") [Yn] error building image: error building stage: waiting for process to exit: exit status 1

The Dockerfile is pretty generic:

FROM elixir:1.6.1
MAINTAINER redacted

ENV MIX_ENV prod
ENV PORT 3000
ENV METRICS_PORT 3001

RUN mix local.hex --force
RUN mix local.rebar --force

WORKDIR /app

ADD mix.exs /app/mix.exs
ADD mix.lock /app/mix.lock
ADD config /app/config
RUN mix deps.get && mix compile

ADD . /app
RUN mix compile

CMD mix phoenix.server

Expected behavior
It should just work™️

To Reproduce
Haven't found a good way to reproduce (yet).

Additional Information

  • Kaniko Image (fully qualified with digest): gcr.io/kaniko-project/executor@sha256:d9fe474f80b73808dc12b54f45f5fc90f7856d9fc699d4a5e79d968a1aef1a72
@discordianfish discordianfish changed the title Caching reuses wrong layer Caching reuses empty/wrong layer Mar 28, 2019
@discordianfish discordianfish changed the title Caching reuses empty/wrong layer Cached layers not getting restored properly Mar 28, 2019
@discordianfish
Contributor Author

Removed my former comment; it was a red herring. The cached files are probably not missing: I copied the root filesystem out of the kaniko container (using docker cp) and chrooted into it, and there mix compile runs just fine. So either something is off in how the command gets invoked, or something is shadowing the directory.
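
The check described above (export the root filesystem, then look for the files the cached step should have produced) can also be scripted. This is a minimal sketch, not part of kaniko: the function name `missing_in_rootfs` is hypothetical, and the only path taken from this thread is the Hex archive directory named in the error message.

```python
import os
import tempfile


def missing_in_rootfs(rootfs, absolute_paths):
    """Return the container paths that do not exist under an exported rootfs.

    `rootfs` is assumed to be a directory holding a copy of the container's
    filesystem (for example one produced with docker cp).
    """
    missing = []
    for path in absolute_paths:
        # Re-root the absolute container path under the exported directory.
        candidate = os.path.join(rootfs, path.lstrip("/"))
        # Note: os.path.exists follows symlinks, so an absolute symlink inside
        # the copy would be resolved against the host, not the rootfs.
        if not os.path.exists(candidate):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Stand-in for an exported rootfs, so the sketch runs without a container.
    rootfs = tempfile.mkdtemp()
    hex_dir = "/root/.mix/archives/hex-0.19.0/hex-0.19.0/ebin"
    os.makedirs(os.path.join(rootfs, hex_dir.lstrip("/")))
    print(missing_in_rootfs(rootfs, [hex_dir, "/app/mix.exs"]))
```

An empty result for the Hex path would match the observation above: the files are present in the filesystem, so the failure has to come from somewhere other than a missing cache layer.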

@discordianfish
Contributor Author

It's a complete mystery to me and the best heisenbug I have come across. Earlier I added find statements before the ADD statement in the Dockerfile to verify that the files exist; now I did this:

diff --git a/Dockerfile b/Dockerfile
index db18be9..7a8f11a 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -16,6 +16,6 @@ COPY config /app/config
 RUN mix deps.get && mix compile

 COPY . /app
-RUN mix compile
+RUN find /app && find /root && mix compile

 CMD mix phoenix.server

...and I can't reproduce it anymore. Now all builds of this pass with caching enabled:

INFO[0001] Using caching version of cmd: RUN mix deps.get && mix compile
INFO[0001] Using files from context: [/src]
INFO[0001] Checking for cached layer xx.dkr.ecr.us-east-1.amazonaws.com/yy/zz/cache:dea4e988581f7550317d99d8ddea7cd74eb5f4b45341bb2fbc31fe6507c8eade...
INFO[0001] No cached layer found for cmd RUN find /app && find /root && mix compile
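
When comparing runs like these, it can help to pull out which commands were served from the cache and which were rebuilt. The following is a rough sketch that relies only on the two message wordings visible in the logs in this thread; other kaniko versions may phrase them differently, and `summarize_cache_usage` is a hypothetical helper name.

```python
import re

# Message fragments copied from the log excerpts in this thread.
HIT = re.compile(r"Using caching version of cmd: (.+)$")
MISS = re.compile(r"No cached layer found for cmd (.+)$")


def summarize_cache_usage(log_text):
    """Return (hits, misses): commands reported as cached and as not cached."""
    hits, misses = [], []
    for line in log_text.splitlines():
        hit = HIT.search(line)
        if hit:
            hits.append(hit.group(1).strip())
            continue
        miss = MISS.search(line)
        if miss:
            misses.append(miss.group(1).strip())
    return hits, misses


if __name__ == "__main__":
    sample = (
        "INFO[0001] Using caching version of cmd: RUN mix deps.get && mix compile\n"
        "INFO[0001] Using files from context: [/src]\n"
        "INFO[0001] No cached layer found for cmd RUN find /app && find /root && mix compile\n"
    )
    cached, rebuilt = summarize_cache_usage(sample)
    print("cached: ", cached)
    print("rebuilt:", rebuilt)
```

In the passing build above, the changed final RUN line shows up as rebuilt rather than cached, which is consistent with the failure only appearing when that step's predecessors come from the cache unchanged.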

@discordianfish
Contributor Author

Removing find /app && find /root && makes the problem reappear. I've tried older versions, and this appears to be a regression introduced between 0.7.0 (which works fine) and 0.8.0 (which doesn't). Here are the commits between these versions: v0.7.0...v0.8.0

I'll try reverting the most likely candidates to see if I can build a version from master without the bug.
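
Reverting likely candidates by hand is the approach taken here. When no candidate stands out, the same range of commits could instead be searched by bisection, which is the idea that git bisect automates. The sketch below only illustrates that search: `first_bad` and `is_bad` are hypothetical names, the commit labels are placeholders, and `is_bad` stands in for "build the executor at that commit and run the cached build twice". It assumes the commits are ordered oldest to newest, the newest one is bad, and every commit after the first bad one is also bad.

```python
def first_bad(commits, is_bad):
    """Return the earliest commit for which is_bad(commit) is true.

    Assumes `commits` is ordered oldest to newest, the last entry is bad,
    and badness never disappears again once it has been introduced.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the first bad commit is at mid or earlier
        else:
            lo = mid + 1    # the first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Placeholder labels, not real kaniko commits.
    history = ["c1", "c2", "c3", "c4", "c5", "c6"]
    broken = {"c4", "c5", "c6"}
    print(first_bad(history, lambda commit: commit in broken))
```

Each step halves the remaining range, so a range of a few dozen commits needs only a handful of test builds. The trade-off is that an intermittent failure, like the heisenbug described earlier in this thread, can mislead the search unless each probe is repeated.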

@discordianfish
Contributor Author

I've just checked out master and reverted 1ffae47, which was added by #525, and built a new image. That fixes the problem. I can now use the cache just fine.

To double check, I also built from master with that commit included and it fails, so I can say with high confidence that 1ffae47 is introducing this behavior.

@discordianfish discordianfish changed the title Cached layers not getting restored properly Regression: #525 causes missing files when caching is enabled in some situations Apr 1, 2019
@donmccasland
Member

Closing on merged fix. Reopen if still an issue.
