
Multi-stage builds silently crashing #2249

Closed
alucryd opened this issue Sep 15, 2022 · 5 comments
Labels
area/errorhandling For all bugs having to do with handling problems during kaniko execution area/multi-arch categorized differs-from-docker issue/hang issue/oom priority/p0 Highest priority. Break user flow. We are actively looking at delivering it. works-with-docker

Comments


alucryd commented Sep 15, 2022

Actual behavior
The kaniko build silently crashes after taking the full filesystem snapshot, with no useful error. The same build works fine with dind, and disabling the kaniko cache doesn't help.

Expected behavior
The build should complete with no issue.

To Reproduce
Steps to reproduce the behavior:

  1. Have your GitLab Runner running in a GKE Autopilot cluster
  2. Run your GitLab CI job with kaniko instead of dind

Additional Information

  • Dockerfile
FROM node:16 as builder
COPY . /app
WORKDIR /app
RUN yarn install --frozen-lockfile --production

FROM gcr.io/distroless/nodejs:16
COPY --from=builder /app /app
WORKDIR /app
EXPOSE 8080
CMD ["--experimental-modules", "--experimental-json-modules", "src/server.js"]
  • Build Context
    Multistage build, first stage copies the Express JS app and installs dependencies, second stage reuses the app directory to produce a distroless image.
  • Kaniko Image (fully qualified with digest)
    gcr.io/kaniko-project/executor:a8498c762f34aabc62966c69169b79a04e04a4d5-debug, v1.9.0-debug

Triage Notes for the Maintainers

CI log:

Executing "step_script" stage of the job script
01:09
$ mkdir -p /kaniko/.docker
$ echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
$ /kaniko/executor --context ${CI_PROJECT_DIR} --dockerfile ${CI_PROJECT_DIR}/Dockerfile --destination ${CI_REGISTRY_IMAGE}:${TAG} --destination ${CI_REGISTRY_IMAGE}:${LATEST_TAG}
INFO[0000] Resolved base name node:16 to builder        
INFO[0000] Retrieving image manifest node:16            
INFO[0000] Retrieving image node:16 from registry index.docker.io 
INFO[0001] Retrieving image manifest gcr.io/distroless/nodejs:16 
INFO[0001] Retrieving image gcr.io/distroless/nodejs:16 from registry gcr.io 
INFO[0002] Built cross stage deps: map[0:[/app]]        
INFO[0002] Retrieving image manifest node:16            
INFO[0002] Returning cached image manifest              
INFO[0002] Executing 0 build triggers                   
INFO[0002] Building stage 'node:16' [idx: '0', base-idx: '-1'] 
INFO[0002] Unpacking rootfs as cmd COPY . /app requires it. 
INFO[0045] COPY . /app                                  
INFO[0052] Taking snapshot of files...                  
INFO[0061] WORKDIR /app                                 
INFO[0061] Cmd: workdir                                 
INFO[0061] Changed working directory to /app            
INFO[0061] No files changed in this command, skipping snapshotting. 
INFO[0061] RUN yarn install --frozen-lockfile --production 
INFO[0061] Initializing snapshotter ...                 
INFO[0061] Taking snapshot of full filesystem...        
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: pod "runner-yrykheow-project-61-concurrent-0gkkzz" status is "Failed"
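
As a side note, the auth entry that the job script writes to /kaniko/.docker/config.json can be checked in isolation. A minimal sketch, with placeholder credentials standing in for the real CI variables (printf is used in place of the script's echo -n, which is not portable across shells):

```shell
# Rebuild the registry auth entry from the job script above,
# using placeholder values instead of the GitLab CI variables.
CI_REGISTRY="registry.example.com"   # placeholder
CI_REGISTRY_USER="user"              # placeholder
CI_REGISTRY_PASSWORD="pass"          # placeholder

# The auth field is just base64 of "user:password".
AUTH=$(printf '%s' "${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD}" | base64)
echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"${AUTH}\"}}}"
```

A malformed auth entry would surface as a registry error at pull or push time, not as a silent crash during snapshotting, which makes it an unlikely cause of the failure above.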
Description | Yes/No
Please check if this is a new feature you are proposing |
Please check if the build works in docker but not in kaniko |
Please check if this error is seen when you use the --cache flag |
Please check if your dockerfile is a multistage dockerfile |

alucryd commented Sep 15, 2022

Doubling the memory request to 4Gi didn't help, so it doesn't appear to be OOM-killed. I also tried 1.8.1 and 1.7.0, with the same result.


alucryd commented Sep 16, 2022

FYI, going back to a single-stage build works, so this is an issue specific to multi-stage builds.
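
For anyone hitting the same crash, the single-stage variant mentioned here would look roughly like this. A sketch only: it keeps everything in the node image and loses the small distroless runtime of the original second stage, so it is a workaround rather than an equivalent:

```dockerfile
# Single-stage workaround: build and run in the same node image.
# Avoids the multi-stage code path at the cost of a larger image.
FROM node:16
COPY . /app
WORKDIR /app
RUN yarn install --frozen-lockfile --production
EXPOSE 8080
# node:16 does not default its entrypoint to node with these flags,
# so invoke node explicitly (distroless/nodejs did this implicitly).
CMD ["node", "--experimental-modules", "--experimental-json-modules", "src/server.js"]
```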


alucryd commented Sep 21, 2022

Got another multi-stage Dockerfile that is crashing; unfortunately, that one can't easily be converted to a single stage.

FROM node:16 as builder

COPY . /app
WORKDIR /app
RUN yarn install --frozen-lockfile

ARG VITE_HIDE_INTERNAL
ARG VITE_HIDE_TRY_IT
ENV VITE_HIDE_INTERNAL=$VITE_HIDE_INTERNAL
ENV VITE_HIDE_TRY_IT=$VITE_HIDE_TRY_IT

RUN yarn build

FROM flashspys/nginx-static
COPY --from=builder /app/build /static
EXPOSE 80
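
Since this second Dockerfile takes build args, the corresponding executor invocation would pass them with --build-arg, which kaniko accepts like docker build does. A sketch with placeholder values, printed rather than executed:

```shell
# Sketch of the kaniko invocation for the Dockerfile above.
# All values below are placeholders, not the real CI variables.
CI_PROJECT_DIR="."                            # placeholder
CI_REGISTRY_IMAGE="registry.example.com/app"  # placeholder
TAG="latest"                                  # placeholder

set -- /kaniko/executor \
  --context "${CI_PROJECT_DIR}" \
  --dockerfile "${CI_PROJECT_DIR}/Dockerfile" \
  --build-arg "VITE_HIDE_INTERNAL=true" \
  --build-arg "VITE_HIDE_TRY_IT=true" \
  --destination "${CI_REGISTRY_IMAGE}:${TAG}"
echo "$@"   # printed here instead of run, since this is only a sketch
```

Note that an ARG not supplied at build time expands to an empty string, so the two ENV lines would set empty values rather than fail the build.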

@alucryd alucryd changed the title Silent crash after full filesystem snapshot Multi-stage builds silently crashing Sep 21, 2022
@aaron-prindle aaron-prindle added area/multi-arch area/errorhandling For all bugs having to do with handling problems during kaniko execution priority/p1 Basic need feature compatibility with docker build. we should be working on this next. differs-from-docker works-with-docker priority/p0 Highest priority. Break user flow. We are actively looking at delivering it. issue/hang issue/no-space-left categorized issue/oom and removed priority/p1 Basic need feature compatibility with docker build. we should be working on this next. issue/no-space-left labels Jun 21, 2023
@JeromeJu JeromeJu removed the priority/p1 Basic need feature compatibility with docker build. we should be working on this next. label Oct 24, 2023

JeromeJu commented Oct 24, 2023

Looks like with the latest kaniko at HEAD (v1.17.0), I am no longer seeing the same error for this repo:

jeromeju@jju:~/kaniko$ ./run_in_docker.sh /dockerfile /usr/local/google/home/jeromeju/kaniko gcr.io/jju-dev/test:latest
INFO[0000] Resolved base name node:16 to builder        
INFO[0000] Using dockerignore file: /workspace/.dockerignore 
INFO[0000] Retrieving image manifest node:16            
INFO[0000] Retrieving image node:16 from registry index.docker.io 
INFO[0000] Retrieving image manifest gcr.io/distroless/nodejs:16 
INFO[0000] Retrieving image gcr.io/distroless/nodejs:16 from registry gcr.io 
INFO[0001] Built cross stage deps: map[0:[/app]]        
INFO[0001] Retrieving image manifest node:16            
INFO[0001] Returning cached image manifest              
INFO[0001] Executing 0 build triggers                   
INFO[0001] Building stage 'node:16' [idx: '0', base-idx: '-1'] 
INFO[0001] Unpacking rootfs as cmd COPY . /app requires it. 
INFO[0022] COPY . /app                                  
INFO[0027] Taking snapshot of files...                  
INFO[0030] WORKDIR /app                                 
INFO[0030] Cmd: workdir                                 
INFO[0030] Changed working directory to /app            
INFO[0030] No files changed in this command, skipping snapshotting. 
INFO[0030] RUN yarn install --frozen-lockfile --production 
INFO[0030] Initializing snapshotter ...                 
INFO[0030] Taking snapshot of full filesystem...        
INFO[0036] Cmd: /bin/sh                                 
INFO[0036] Args: [-c yarn install --frozen-lockfile --production] 
INFO[0036] Running: [/bin/sh -c yarn install --frozen-lockfile --production] 
yarn install v1.22.19
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
Done in 0.06s.
INFO[0037] Taking snapshot of full filesystem...        
INFO[0039] Saving file app for later use                
INFO[0041] Deleting filesystem...                       
INFO[0043] Retrieving image manifest gcr.io/distroless/nodejs:16 
INFO[0043] Returning cached image manifest              
INFO[0043] Executing 0 build triggers                   
INFO[0043] Building stage 'gcr.io/distroless/nodejs:16' [idx: '1', base-idx: '-1'] 
INFO[0043] Unpacking rootfs as cmd COPY --from=builder /app /app requires it. 
INFO[0045] COPY --from=builder /app /app                
INFO[0047] Taking snapshot of files...                  
INFO[0050] WORKDIR /app                                 
INFO[0050] Cmd: workdir                                 
INFO[0050] Changed working directory to /app            
INFO[0050] No files changed in this command, skipping snapshotting. 
INFO[0050] EXPOSE 8080                                  
INFO[0050] Cmd: EXPOSE                                  
INFO[0050] Adding exposed port: 8080/tcp                
INFO[0050] CMD ["--experimental-modules", "--experimental-json-modules", "src/server.js"] 
INFO[0050] Pushing image to gcr.io/jju-dev/test:latest  
INFO[0053] Pushed gcr.io/jju-dev/test@sha256:d9b6d976408fa96a357f0dcb96856c544649cf3d31fac7bf3baf579b43c4175e 

Would you mind providing an update on whether this issue still persists? Otherwise we might close it.


alucryd commented Oct 25, 2023

Apologies, I completely forgot to get back to you. It started working fine some time ago, so this issue can definitely be closed. Thanks for the update!

@alucryd alucryd closed this as completed Oct 25, 2023