Image pull cache misbehavior: cache key calculated over uncompressed layer checksums (vs compressed) #2226

Closed
aecay opened this issue Jul 6, 2021 · 3 comments · Fixed by #2231

aecay commented Jul 6, 2021

Due to a change in a tool that mirrors docker images from public registries to a company-internal one, there are now two very similar versions of some images in our internal registry. Since then, there have been errors with buildkit (specifically a long-running buildkitd instance in our CI pipeline) when building images that are FROM either of these versions. The symptoms of the bug are as follows:

  1. at first, a build FROM either of the images works
  2. after this, further builds FROM the same image work fine
  3. but builds FROM the other image fail with an error like this:
    2021-07-06 11:12:15.815  #4 [1/1] FROM registry/core-networking/release/quay.io/cil...
    2021-07-06 11:12:15.815  #4 resolve registry/core-networking/release/quay.io/cilium/hubble:v0.5.1-0.0.1@sha256:41ed0dd296bfcfd5c8f4ce53e36824a0c256fc5f9adce10584e37e1e2d1c6f97 0.0s done
    2021-07-06 11:12:15.815  #4 ERROR: failed to get record 8i7b094rsfhzf9z3urd2c5fmd by chainid: missing descriptor handlers for lazy blobs [sha256:038f0311438f4b4b2578916e434c8449f4ded30f9b516e325771219b80670cf1]
    
  4. running buildctl prune on the buildkitd instance resets whatever the problem is; after that, the first image to be built again "wins" and locks out its doppelganger (a sketch of the prune command is included below).

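For reference, the workaround in step 4 is just a cache prune against the long-running daemon. A minimal sketch, assuming a buildkitd instance reachable over TCP (the address here is made up):

% buildctl --addr tcp://buildkitd.internal:1234 prune

After that, the next image to be built FROM populates the cache first and "wins" again.
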
Looking at the logs above alongside the manifests of both images, I think I can clearly see something going wrong:

manifests of both images
% crane digest registry/core-networking/release/quay.io/cilium/hubble:v0.5.1-0.0.1
sha256:41ed0dd296bfcfd5c8f4ce53e36824a0c256fc5f9adce10584e37e1e2d1c6f97
% crane manifest registry/core-networking/release/quay.io/cilium/hubble:v0.5.1-0.0.1
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 2365,
      "digest": "sha256:3b2b06a782b31dc7b21a65f6ae8ec2ca30f327cc8c599f0c594c637949a1381d"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 2813316,
         "digest": "sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1916657,
         "digest": "sha256:eeb760e03ef1a4bd98b03c87f409b44e2424ef9b4759c1c458130bc5dfc96f76"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 14461977,
         "digest": "sha256:2bd3f7b5f26019a8f6e2a044576eebf494cc13a50c7b1b59aa03ef455a008b3c"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1397030,
         "digest": "sha256:96e3156e019aee51c7dc5d21e336fff3a8fe77e1972a7955ff86e4acf3244c6a"
      }
   ]
}
% crane digest registry/core-networking/release/cilium/hubble:v0.5.1-0.1.0
sha256:24785434ffba35f8e3ec6bbbac6a00473cc6577cf9232879abfe750419463904
% crane manifest registry/core-networking/release/cilium/hubble:v0.5.1-0.1.0
{                         
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 2365,
      "digest": "sha256:3b2b06a782b31dc7b21a65f6ae8ec2ca30f327cc8c599f0c594c637949a1381d"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 2813316,
         "digest": "sha256:038f0311438f4b4b2578916e434c8449f4ded30f9b516e325771219b80670cf1"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1916657,
         "digest": "sha256:a0cd755b63fda14c297ed6f6a4f910e42af8c7eb0d056d6eeaeea653bb099a88"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 14461977,
         "digest": "sha256:608f0f0c3c2db7708268996bf1dabf1fcd5c47b6b011a6989dfe22421854077e"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1397030,
         "digest": "sha256:f0038aa1121bc8221891c76fd98bdd6fb6c72f88f5519754e9bfdf640c608964"
      }
   ]
}

The output above shows that the "locked out" image with digest sha256:41ed0... does NOT contain the layer with digest sha256:038f0... referenced in the error. That layer instead belongs to the doppelganger image whose digest is sha256:24785.... In other words, the error pasted above shows (I believe) buildkit doing something invalid: trying to access a layer that doesn't even belong to the image that the docker build is FROM. Somehow the cache must be getting corrupted for this to happen.

I noticed that both images have an identical config blob. I believe what's happening in the code is that a cache key for the image is being computed from the diffIDs, which are digests of the uncompressed layer tars. However, image layers can get recompressed, as has evidently happened to some of the images we archived to our registry a while ago and are now rearchiving with the new tool. Recompression changes the sha256 digests of the compressed layers themselves, as stored in the image manifest (and in the registry), while leaving the diffIDs untouched.
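
As a sanity check, here is a sketch of how one could compare the uncompressed diffIDs of the two images, assuming crane and jq are available:

% crane config registry/core-networking/release/quay.io/cilium/hubble:v0.5.1-0.0.1 | jq '.rootfs.diff_ids'
% crane config registry/core-networking/release/cilium/hubble:v0.5.1-0.1.0 | jq '.rootfs.diff_ids'

Because both manifests point at the same config digest (sha256:3b2b06a...), these two commands necessarily print the same diff_ids list, even though the layers[].digest values in the two manifests above all differ.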

So when the first image is built, buildkit pulls the layers compressed in one way, but marks the cache as being populated with data equivalent to the uncompressed layers. When the image based on the second compression flavor is built, the cache doesn't contain the (compressed) data that its (uncompressed) keys say it should have, an invariant is violated, and the process errors out.
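
To make the chaining concrete, here is a sketch of the OCI chain-ID formula, ChainID(n) = Digest(ChainID(n-1) + " " + DiffID(n)), applied to the uncompressed diffIDs; whether buildkit derives its cache key exactly this way is part of what I'm asking to have confirmed below. IMAGE is a placeholder for either tag:

IMAGE=registry/core-networking/release/quay.io/cilium/hubble:v0.5.1-0.0.1
chain=""
for diffid in $(crane config "$IMAGE" | jq -r '.rootfs.diff_ids[]'); do
  if [ -z "$chain" ]; then
    chain="$diffid"                                     # ChainID(1) = DiffID(1)
  else
    # ChainID(n) = sha256 over "ChainID(n-1) DiffID(n)"
    chain="sha256:$(printf '%s %s' "$chain" "$diffid" | sha256sum | awk '{print $1}')"
  fi
done
echo "$chain"

Since the two images share a config (and therefore share diff_ids), this prints the same chain for both, which is how two differently compressed images can end up behind one cache key.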

The simplest way to solve this would be (I think) to calculate the cache key based on the chained compressed layer digests, not (as now) the chained uncompressed digests. That leads to "unnecessary" cache misses when data has been downloaded under compression scheme X but is missing under Y -- but the bookkeeping to match up compressed and uncompressed layer checksums seems like it would be fairly baroque for marginal benefit.
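
Concretely, the alternative key would be built from the manifest's compressed layer digests rather than the config's diff_ids, i.e. feeding the same chaining as above with:

crane manifest "$IMAGE" | jq -r '.layers[].digest'

Those values differ between the two images, so the doppelgangers would get distinct cache keys and simply miss each other's cache instead of tripping over each other's blobs.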

I think I've finally grokked this, although it's been a long journey of discovery to get here. At this point, resolving this issue could use:

  1. confirmation from someone more familiar than I am with the buildkit code base, Docker ecosystem, and file formats that my analysis of the problem is correct
  2. an indication from the maintainers of whether the proposed fix (changing the cache key calculation so that it's based on compressed layer digests) is acceptable, or whether another fix is preferred
  3. Someone™ to code whatever fix is decided on

Thanks, and please do let me know if there's something further I can provide to aid diagnosis/understanding.

tonistiigi (Member) commented:

@sipsma

sipsma (Collaborator) commented Jul 7, 2021

@aecay Many thanks for the detailed and helpful report; your analysis of the problem seems spot on, and I was able to reproduce the behavior with a unit test.

I've opened #2231 with a fix. While your suggestion would work, I think we actually want the cache map to be the same for these images: in terms of cache, we really care about the contents of the final unpacked filesystem, not the format those filesystems were transferred to us in. Instead, I just changed the code to ignore the error you were getting. It was basically trying to optimize by re-using the unpacked snapshot of the other image, but couldn't, because the other image turns out to be "lazy", meaning it exists in the cache but the actual pull of it has been deferred until later. Ignoring the error is fine there, as it just skips an optimization that wasn't possible anyway.

aecay (Author) commented Jul 7, 2021

Brilliant, thanks for the explanation and the very quick fix 🙂
