Added rough guide on how to cache a docker image #6288

Closed

lawrencegripper

This is a very rough guide on how to use the cache task to speed up docker builds. Would love to hear feedback on how to get this in a good state to merge.

@PRMerger17
Contributor

@lawrencegripper : Thanks for your contribution! The author(s) have been notified to review your proposed change.

@lawrencegripper
Author

Any thoughts on getting this merged or updated?

@ktoliver
Contributor

ktoliver commented Dec 2, 2019

Hi @willsmythe - Could you review the pull request and indicate whether it can be merged? Thanks.

@willsmythe
Collaborator

@johnterickson is probably the right person to review.

@ktoliver
Contributor

ktoliver commented Dec 2, 2019

Hi @johnterickson - Are you able to review the PR and indicate whether it should be merged? Thanks.

Note that you don't show up as a user in the GitHub Assignees feature (top right), which indicates that your GitHub account isn't fully linked to your Microsoft account as described in Link your GitHub and Microsoft accounts; I can't assign the PR to you.

@johnterickson
Contributor

Thanks for submitting this @lawrencegripper !

Have you compared to what we have here: microsoft/azure-pipelines-tasks#11034 (comment) ?

We found "docker load" from local file to be slower than pulling it from a remote registry 😬

@lawrencegripper
Author

lawrencegripper commented Dec 3, 2019

So the approach detailed here? https://github.com/fadnavistanmay/Scripts/blob/c7739a722528d7e27cc0f4f6e1adbaf4a1392977/docker.yml

In terms of simplicity, I'd rather use the cache than have to set up container registries etc., which seems like a painful amount of admin. Competitors in this space do this nicely -> https://circleci.com/docs/2.0/docker-layer-caching/

One approach I tried to get working was to identify the Docker storage driver in use and back up the layers directly from the /lib/docker file, but the cache task wasn't running as a user with the appropriate permissions.

In terms of speed, I've also observed this approach being slow. One reason appears to be that the throughput when downloading the image from the cache is lower than I'd expect. I was expecting the cache to be a page blob or Azure Files Premium share mounted onto the VM from Azure Blob rather than a download step (essentially bringing the data to the node rather than having the node download the data).

The docker load call takes around 2 minutes for the 2.5 GB image.

-------> Restoring docker image
Loaded image: devcontainer:latest

real	2m9.119s
user	0m0.608s
sys	0m1.735s
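
For a rough sense of scale, the timing above can be turned into an average throughput figure. This is only a back-of-the-envelope sketch in Python: it assumes the image is exactly 2.5 GB in decimal units (1 GB = 1000 MB) and that the `real` time covers nothing but the `docker load` call. The function name is hypothetical, made up for illustration.

```python
def throughput_mb_per_s(size_gb: float, minutes: int, seconds: float) -> float:
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    elapsed_s = minutes * 60 + seconds
    return size_gb * 1000 / elapsed_s


# Figures taken from the comment: ~2.5 GB image, `real 2m9.119s`.
# The 2.5 GB size is approximate, so the result is an estimate only.
rate = throughput_mb_per_s(2.5, 2, 9.119)
print(f"{rate:.1f} MB/s")  # prints "19.4 MB/s"
```

Note that `user` and `sys` only account for the docker CLI process, not the daemon that does the actual loading, so they say little about where the two minutes go.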

@stefankip

Any updates?

@lawrencegripper
Author

lawrencegripper commented Dec 18, 2019

Gonna close this one off. As there are different approaches, maybe it's better suited to a blog post.

For discoverability, it would be awesome to see one of the preferred approaches highlighted in the caching doc, even if only to say "Don't use the cache feature for Docker images; push them to a repository instead, it's better".

@lawrencegripper lawrencegripper deleted the patch-1 branch December 18, 2019 09:09