failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF #4305
Comments
The error in the logs above is:
Is there an additional stack trace in the logs, @eoshea-cmt? Without that, there's not enough information to work out whether this is a duplicate of #4157 or a separate issue.
@jedevc I shared the stack trace in the "Build logs" section of my issue, see here:
cc @tonistiigi
It looks like
@jedevc is there someone I can reach out to about getting a buildkit version that includes this fix into docker-compose?
Description
During docker (compose) builds, we occasionally see this error in our CI:
failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
This can happen at various stages in docker builds, including:
We used our instance monitoring to check whether the errors correlate with resource usage. We looked at network, memory, and CPU utilization, and none of them spiked when the errors occurred.
This error can kill multiple builds running in parallel on our CI nodes, but it also happens to single builds.
Expected behaviour
docker compose builds progress to completion
Actual behaviour
docker compose builds fail
Buildx version
github.com/docker/buildx v0.11.2 9872040
Docker info
Builders list
Configuration
We are not able to reproduce the issue consistently. We build multiple images with multiple stages using docker-compose, which may be relevant.
We also run multiple jobs on the same instances in our CI, so multiple docker compose builds sometimes happen in parallel. Furthermore, the error can hit multiple docker compose builds at the same time when they are running in parallel on the same node.
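In case it helps anyone trying to reproduce this, here is a minimal sketch of a harness that runs several builds in parallel and counts how many hit the EOF error. It is not something we run in our CI; the function names (`has_eof_error`, `run_build`, `run_parallel`, `summarize`) and the default job count are made up for illustration, and it assumes `docker compose build` is invoked from a directory containing a compose file.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Error text as quoted in this issue.
EOF_ERROR = "rpc error: code = Unavailable desc = error reading from server: EOF"


def has_eof_error(output: str) -> bool:
    """Return True if the build output contains the EOF error from this issue."""
    return EOF_ERROR in output


def run_build(cmd: list[str]) -> tuple[int, bool]:
    """Run one build command; return (exit code, whether the EOF error appeared)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, has_eof_error(proc.stdout + proc.stderr)


def run_parallel(cmd: list[str], jobs: int = 4) -> list[tuple[int, bool]]:
    """Run `jobs` copies of the same command concurrently, like a shared CI node."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_build, [cmd] * jobs))


def summarize(results: list[tuple[int, bool]]) -> dict[str, int]:
    """Count total runs, failed runs, and runs that showed the EOF error."""
    failed = sum(1 for code, _ in results if code != 0)
    eof = sum(1 for _, hit in results if hit)
    return {"runs": len(results), "failed": failed, "eof_errors": eof}


# Hypothetical usage (assumes a compose file in the current directory):
#   print(summarize(run_parallel(["docker", "compose", "build"], jobs=4)))
```

Since the failure is intermittent, a single round of this may well show nothing; it would probably need to be looped many times to be useful.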
Build logs
dockerlog.txt
Additional info
I previously created this ticket for buildx, before the conversation pointed to this being a buildkit issue.
This seems like it could be a similar (but different) error to:
microsoft/vscode-remote-release#7958
or #4157
I'm wondering if it is some other race condition that only happens occasionally.
It does not seem to be correlated with resource usage.