Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "build: Use different cache and output registries" #155

Merged
merged 2 commits into from
May 16, 2023

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented May 12, 2023

This reverts commit 917a048 and fixes the comment added by efbbfb9.

We need to cache from ghcr.io because otherwise the build of nextstrain/base doesn't find the layers from the just-built nextstrain/base-builder pushed to ghcr.io. Instead, it seemingly uses layers from an older nextstrain/base-builder on docker.io.

My understanding is that we were always pulling caches from docker.io (and ghcr.io) because docker.io is the default registry when none is specified, and we use --cache-from lines without a registry specified.

See https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1683932186614689.

Testing

  • Checks pass

This reverts commit 917a048 and fixes
the comment added by efbbfb9.

We need to cache from ghcr.io because otherwise the build of
nextstrain/base doesn't find the layers from the just-built
nextstrain/base-builder pushed to ghcr.io.  Instead, it seemingly uses
layers from an older nextstrain/base-builder on docker.io.

My understanding is that we were always pulling caches from docker.io
(_and_ ghcr.io) because docker.io is the default registry when none is
specified, and we use --cache-from lines without a registry specified.

See <https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1683932186614689>.
@tsibley
Copy link
Member Author

tsibley commented May 12, 2023

This should fix the issue described by the linked Slack thread. To check, do:

$ docker run --rm -it nextstrain/base-builder:branch-cache-from-ghcr.io augur --version
22.0.0
$ docker run --rm -it nextstrain/base:branch-cache-from-ghcr.io augur --version
22.0.0

Both should be 22.0.0. If base-builder is 22.0.0 and base is 21.1.0, then the bug isn't fixed.

Someone else (maybe @huddlej) should check next week and merge if it looks good.

Copy link
Member

@corneliusroemer corneliusroemer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested as explained by @tsibley and it doesn't seem to work for me - or do I need to build first?

$ docker run --rm -it nextstrain/base-builder:branch-cache-from-ghcr.io augur --version
Unable to find image 'nextstrain/base-builder:branch-cache-from-ghcr.io' locally
branch-cache-from-ghcr.io: Pulling from nextstrain/base-builder
Digest: sha256:cb6a55e7c4bd4d9efb9f8fd6ed900d1a11f8f481096c22db794d5b2f3c9347e3
Status: Downloaded newer image for nextstrain/base-builder:branch-cache-from-ghcr.io
augur 22.0.0
$ docker run --rm -it nextstrain/base:branch-cache-from-ghcr.io augur --version
Unable to find image 'nextstrain/base:branch-cache-from-ghcr.io' locally
branch-cache-from-ghcr.io: Pulling from nextstrain/base
Digest: sha256:c50bf9fdca3b52ec43458a7d11188f8778bf12aa06a933de8be6d4cc01ca9833
Status: Downloaded newer image for nextstrain/base:branch-cache-from-ghcr.io
augur 21.1.0

@huddlej
Copy link
Contributor

huddlej commented May 15, 2023

Darn, I get the same result as @corneliusroemer. @tsibley and @victorlin are out this week, so we may need to continue troubleshooting on our own, Cornelius...

@joverlee521
Copy link
Contributor

Yeah, this isn't resolving the issue.
Looking at the "Created" date for the images:

$ docker inspect -f '{{.Created}}'  nextstrain/base-builder:branch-cache-from-ghcr.io
2023-05-12T23:43:54.823419242Z
$ docker inspect -f '{{.Created}}' nextstrain/base:branch-cache-from-ghcr.io
2023-05-08T22:30:15.085923811Z

@joverlee521
Copy link
Contributor

Looking at the test run logs for this PR, it seems like the new branch ghcr.io builder image is being pulled when building the final image. However, the COPY commands from the builder to the final image are still using the cache.

This leads to my current question of how does Docker handle multiple --cache-from sources. I did not find anything in the docker docs, but I did find an old comment on the expected behavior:

When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.

I'm going to try go reorder our --cache-from sources for the final image to see if that will affect the outcome.

@joverlee521
Copy link
Contributor

Well, reordering did not work 😞

Instead of continuing to bang my head on this, I just removed the --cache-from that sources the base-builder image from docker.io to ensure that we only use the latest base-builder image that is on ghcr.io.

@joverlee521 joverlee521 force-pushed the cache-from-ghcr.io branch from 997b702 to 4470eff Compare May 16, 2023 20:22
@joverlee521
Copy link
Contributor

Urgh, so it seems like the base-builder image is not the issue since removing the cache-from docker.io did not fix this issue.

Somehow when building the final image, Docker is not detecting the changes from the final image cache. The latest push removes the cache for the final image from docker.io so it should run all of the COPY commands.

I _think_ Docker checks for the matching cache that is provided using the
correct CACHE_DATE, which will allow Docker to use the latest built
builder image.
@joverlee521 joverlee521 force-pushed the cache-from-ghcr.io branch from 4470eff to b762669 Compare May 16, 2023 21:27
@joverlee521
Copy link
Contributor

joverlee521 commented May 16, 2023

I think I found the (additional) underlying problem that is leading to the weird cache.

In a recent commit, the CACHE_DATE arg was removed from the final build stage. I think that means when Docker is building the final image, it will not be able to find the base-builder cache layer that uses the same CACHE_DATE. If we provide it again, it should be able to figure the correct caches to use for building the final image.

@joverlee521 joverlee521 merged commit 4a8cf85 into master May 16, 2023
@joverlee521 joverlee521 deleted the cache-from-ghcr.io branch May 16, 2023 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

4 participants