Add HEAD endpoints #207

jdolitsky · 2020-10-28T21:51:35Z

https://docs.docker.com/registry/spec/api/#existing-manifests

"Docker’s use of HEAD is also has a Docker-Content-Digest in the response headers so you can HEAD against a tag to determine the digest that the tag currently points to."

dmcgowan · 2020-10-29T16:09:54Z

What exactly is the addition recommended here? HEAD is already well defined as being the same as GET without the body and shouldn't be considered separate endpoints. Is the point to highlight the use of Docker-Content-Digest?

I am +1 for explicitly calling out that HEAD should be supported, though, and for having that in conformance tests. There have been issues in the past with registries returning different response codes or headers for HEAD.

jdolitsky · 2020-10-29T16:16:49Z

There was some convo yesterday about this - containerd uses HEAD and falls back to GET. So the plan is to add to conformance, and also add it to the endpoints table

simskij · 2020-11-05T13:01:16Z

Given DockerHub's recent implementation of rate limits, an interesting issue surfaced around the behavior of HEAD. If you do a HEAD request against a manifest-list, the Docker-Content-Digest returned is not for the actual manifest list but for the resolved manifest depending on os and arch. In the case of DockerHub at least, this seems to default to linux/amd64 unless overridden.

Docker, and I assume other container environments as well, store the manifest list's digest as the actual image digest, without resolving the digest of the manifest it's redirected to. It would therefore probably make sense if a HEAD request against a manifest list returned the digest of that list, matching what's being stored locally.

There is already a way around this by supplying a "Accept:application/vnd.docker.distribution.manifest.list.v2+json" header to the request, however, this is not currently done. So, my suggestion basically comes down to making application/vnd.docker.distribution.manifest.list.v2+json the default Content-Type for head requests.

I get that the ideal thing here would be to change the digests stored by docker-engine, but given the wide adoption of docker-ce, and the wide array of versions being used, any changes like that would probably take years to propagate fully into the wild.

Some repros using DockerHub. I appreciate any challenges to my understanding as I may very well have gotten it wrong.

$ docker inspect containrrr/watchtower
...
"RepoDigests": [
            "containrrr/watchtower@sha256:d0331edc5b1c5bbf18a92fc27c50f32e8bb894cf67a06cbd33d04eb40d5c8cc2"
        ],
...

Curl response with the header added:

$ curl -v -H "Authorization: Bearer $TOKEN" -H "Accept:application/vnd.docker.distribution.manifest.list.v2+json" https://registry-1.docker.io/v2/containrrr/watchtower/manifests/latest 2>&1 | grep Digest
< Docker-Content-Digest: sha256:d0331edc5b1c5bbf18a92fc27c50f32e8bb894cf67a06cbd33d04eb40d5c8cc2

and without, defaulting to application/vnd.docker.distribution.manifest.v1+prettyjws

$ curl -v -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/containrrr/watchtower/manifests/latest 2>&1 | grep Digest
< Docker-Content-Digest: sha256:e24b7c874f4e514676a3aeca424ed55a03d32ea1095925d8e37f910bfea3d782

The result of this currently seems to be that all HEAD requests against multi-arch/manifest-lists are considered to indicate that the digest has changed, leading to an actual GET of the full manifest.
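To make the comparison above concrete, here is a minimal sketch of the "has the digest changed?" check a client could run on the HEAD response headers. The function names `digest_from_headers` and `digest_unchanged` are hypothetical and only illustrate the idea; they are not taken from docker or any other client.

```shell
# Extract the Docker-Content-Digest value from raw HTTP response headers on stdin.
# Header names are case-insensitive, and curl output keeps the trailing CR.
digest_from_headers() {
  tr -d '\r' | awk '{
    i = index($0, ":")
    if (i && tolower(substr($0, 1, i - 1)) == "docker-content-digest") {
      v = substr($0, i + 1)
      sub(/^[ \t]+/, "", v)
      print v
      exit
    }
  }'
}

# Return 0 when the locally stored digest equals the remote one (no GET needed).
# An empty remote digest is treated as "changed", so the client falls back to GET.
digest_unchanged() {
  local_digest="$1"
  remote_digest="$2"
  [ -n "$remote_digest" ] && [ "$local_digest" = "$remote_digest" ]
}

# Usage (needs network access and a valid $TOKEN, so it is left commented out):
#   remote=$(curl -sI -H "Authorization: Bearer $TOKEN" \
#       -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
#       https://registry-1.docker.io/v2/containrrr/watchtower/manifests/latest \
#       | digest_from_headers)
#   digest_unchanged "$LOCAL_DIGEST" "$remote" && echo "up to date"
```

Without the Accept header in the usage example, the remote digest would be that of the down-converted manifest, so the check would always report a change for a manifest list.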

amouat · 2020-11-05T13:13:22Z

The current draft of the OCI spec has removed mention of the Docker-Content-Digest header completely. As this enables some useful workflows, I think it would be good to add it back and also to the conformance tests (I only realised this week that Trow doesn't set it currently).

I understand we probably want to remove the word Docker, but perhaps that can wait until later?

thaJeztah · 2020-11-05T13:25:11Z

I would expect the specs to describe a Content-Digest header (without the Docker prefix), but allow registries to return a Docker-Content-Digest header for backward compatibility with older clients (not sure if that would be part of the spec, as "any other header" likely is allowed).

If it does describe the Docker-prefixed one, it should define that clients consuming the digest header (I'd gather using it would always be optional) MUST prefer Content-Digest over Docker-Content-Digest (i.e., "ignore" Docker-Content-Digest if both are present).
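A rough sketch of that preference rule on the client side, assuming the unprefixed Content-Digest header carries the same `sha256:...` value format as Docker-Content-Digest (that is an assumption for illustration; nothing in this thread defines it). `preferred_digest` is a hypothetical helper name.

```shell
# Read raw HTTP response headers on stdin and print the digest a client should use:
# Content-Digest when present, otherwise Docker-Content-Digest, otherwise nothing.
preferred_digest() {
  tr -d '\r' | awk '
    {
      i = index($0, ":")
      if (i == 0) next                      # status line or blank line
      name = tolower(substr($0, 1, i - 1))  # header names are case-insensitive
      value = substr($0, i + 1)
      sub(/^[ \t]+/, "", value)
      if (name == "content-digest") plain = value
      else if (name == "docker-content-digest") legacy = value
    }
    END {
      if (plain != "") print plain          # unprefixed header wins when both exist
      else if (legacy != "") print legacy   # backward-compatible fallback
    }'
}
```

Because the choice is made only after all headers have been read, the order in which the registry emits the two headers does not matter.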

amouat · 2020-11-05T14:04:32Z

Thanks @thaJeztah - to be honest I wanted to say the same and gave up trying to word it correctly! 🤦

jdolitsky · 2020-11-05T23:05:10Z

Hi all - this conversation has split into two separate threads. Please see the conclusions here: #208 (comment)

jonjohnsonjr · 2020-11-11T00:30:36Z

Given DockerHub's recent implementation of rate limits, an interesting issue surfaced around the behavior of HEAD. If you do a HEAD request against a manifest-list, the Docker-Content-Digest returned is not for the actual manifest list but for the resolved manifest depending on os and arch. In the case of DockerHub at least, this seems to default to linux/amd64 unless overridden.

Even worse than that, it's down-converted to a schema 1 image.

Even more worse, there used to be a bug where you'd see the manifest list digest even if it returned a schema 2 image (based on your accept headers). That's luckily now been fixed: distribution/distribution#2395

There is already a way around this by supplying a "Accept:application/vnd.docker.distribution.manifest.list.v2+json" header to the request, however, this is not currently done. So, my suggestion basically comes down to making application/vnd.docker.distribution.manifest.list.v2+json the default Content-Type for head requests.

I think you mean Accept instead of Content-Type, but that only makes sense if the manifest on the other side is a manifest list. Clients should supply a list of content types that they support in the Accept header, which is covered here: https://docs.docker.com/registry/spec/api/#pulling-an-image-manifest

~~I am really confused about why half of the registry spec was thrown away -- this stuff is vital to a correct implementation.~~ (sorry)

We should make sure we haven't left out any other details like this. It should be possible to produce a client that works against existing registries just from reading the spec (especially docker hub).

jdolitsky · 2020-11-11T03:49:31Z

Hey @jonjohnsonjr - in response to

I am really confused about why half of the registry spec was thrown away -- this stuff is vital to a correct implementation.

We can use your knowledge and experience here. The goal is to make things digestible to newcomers to this specification. If things were left out that you view to be vital, we are probably missing some context.

Please reach out via slack/email, I'd like to setup some time to meet and address any concerns.

jonjohnsonjr · 2020-11-11T20:38:10Z

Apologies for that last remark -- I am genuinely confused here, and didn't intend for that to come off so rudely. I know this is a lot of work and it's challenging to tackle a document of this size, so please don't take it personally :)

I'll find you on slack to follow up.

simskij · 2020-11-12T07:37:07Z

I think you mean Accept instead of Content-Type, but that only makes sense if the manifest on the other side is a manifest list. Clients should supply a list of content types that they support in the Accept header, which is covered here:

No, I mean Content-Type. I'm saying that, if it is a manifest list, and no Accept has been provided, it would make a lot more sense if it defaulted to returning the manifest list rather than picking a manifest (seems like it defaults to linux/amd64) and return that.

The behavior is not obvious and kind of counter-intuitive currently, as it's not returning the type that matches my request the closest, in this case: a tag that happens to point at a manifest list.

thaJeztah · 2020-11-12T08:54:15Z

In the case of DockerHub at least, this seems to default to linux/amd64 unless overridden.

Even worse than that, it's down-converted to a schema 1 image.

For the Docker Hub case, both have been done to remain backward-compatible:

- default to linux/amd64, as it was the only OS and architecture originally supported (and thus would be the default image format selected)
- serving v1 manifests, to keep backward compatibility with old clients which would not send (appropriate) Accept headers.

No, I mean Content-Type. I'm saying that, if it is a manifest list, and no Accept has been provided, it would make a lot more sense if it defaulted to returning the manifest list rather than picking a manifest (seems like it defaults to linux/amd64) and return that.

@simskij see my comment above; for the Docker Hub case, this is done to remain backward compatible. I was also reading through the HTTP RFCs for content negotiation yesterday (more below), and both variants are "valid". However, in the "no Accept header" case, if a server decides not to do active content-negotiation and to return a manifest list, a 406 status response could possibly be more appropriate. That said, perhaps the Accept header should be required for v2-capable clients?

Testing some variations against Docker Hub, here's how it currently handles content-negotiation:

✅ For a multi-arch repository, specifying multiple Accept headers returns the manifest list (active content-negotiation; leaving it up to the client to pick the best-matching variant);

export token="$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/hello-world:pull" | jq --raw-output '.token')";

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v1+json' \
    "https://registry-1.docker.io/v2/library/hello-world/manifests/latest"

HTTP/1.1 200 OK
Content-Type: application/vnd.docker.distribution.manifest.list.v2+json
...

✅ Using Accept with only v1 manifest, returns the v1 manifest (same as when no Accept header is set) (again, "best match"):

export token="$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/hello-world:pull" | jq --raw-output '.token')";

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v1+json' \
    "https://registry-1.docker.io/v2/library/hello-world/manifests/latest"

HTTP/1.1 200 OK
Content-Type: application/vnd.docker.distribution.manifest.v1+prettyjws
...

✅ On a single-arch repository (armhf/hello-world:latest), specifying multiple Accept headers returns the v2 manifest (best match for the given Accept headers):

export token="$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:armhf/hello-world:pull" | jq --raw-output '.token')";

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v1+json' \
    "https://registry-1.docker.io/v2/armhf/hello-world/manifests/latest"

HTTP/1.1 200 OK
Content-Type: application/vnd.docker.distribution.manifest.v2+json
...

⚠️ Using Accept with only v2 manifest-list, on a single-arch repository returns the v1 manifest

This one is odd, and feels like a bug/omission. Based on the presence of a v2 Accept header, the registry should probably return the v2 image manifest (or generate a manifest list on-the-fly):

export token="$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:armhf/hello-world:pull" | jq --raw-output '.token')";

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    "https://registry-1.docker.io/v2/armhf/hello-world/manifests/latest"

HTTP/1.1 200 OK
Content-Type: application/vnd.docker.distribution.manifest.v1+prettyjws
...

However, content-negotiation is not currently described in the specification, so how to handle Accept headers is up to each implementation. From looking at the responses, the current implementation (Docker Hub, and the open source registry from https://github.com/docker/distribution) also looks incomplete, as the registry does not return a Vary header (rfc7231, section 7.1.4), which I think it should.

Docker Hub (and other registries that existed before the introduction of the v2 schemas) may be in a slightly special position, as they have to preserve backward compatibility for that reason. Other registries could make other choices when dealing with active (server-side) content-negotiation.

That said, I do think it would be good to have content-negotiation added to the specs, but as OPTIONAL. The reason for making it optional is that it should be possible to host a static registry (which would not be able to perform active content-negotiation). In situations where no active content-negotiation is performed by the registry, it should be handled by the client (in case of a manifest-list, the client is responsible for picking the right variant from the list).
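As an illustration of the client-side case, here is a small sketch of picking a variant from a manifest list with jq. The helper name `pick_platform_digest` is hypothetical; the fields it reads (`manifests[].platform.os`, `manifests[].platform.architecture`, `manifests[].digest`) are those of the manifest list format. It ignores the `variant` field, so on ARM it simply takes the first matching entry, which a real client would want to refine.

```shell
# Print the digest of the first manifest-list entry matching the given OS and
# architecture. Reads a manifest list (JSON) on stdin; prints nothing on no match.
#   $1: os (e.g. linux)   $2: architecture (e.g. amd64)
pick_platform_digest() {
  jq -r --arg os "$1" --arg arch "$2" '
    .manifests[]
    | select(.platform.os == $os and .platform.architecture == $arch)
    | .digest' | head -n 1
}

# Usage (needs network access and a valid $token, so it is left commented out):
#   curl -fsSL -H "Authorization: Bearer $token" \
#       -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
#       "https://registry-1.docker.io/v2/library/hello-world/manifests/latest" \
#       | pick_platform_digest linux amd64
```

The resulting digest can then be used in place of the tag in a second request to fetch the platform-specific manifest.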

I started writing up a draft / some ideas yesterday, and will try to post an initial proposal for discussion. It also covers active content-negotiation for multi-arch manifests, which could benefit (e.g.) ARM architectures, where a client may support multiple (v5/v6/v7/v8) variants: allowing the client to send a list of accepted architectures with priorities would save the extra roundtrips needed to fetch the manifest list and pick the variant on the client side.

amouat · 2020-11-12T09:39:35Z

@thaJeztah are there still active clients that only work with v1 manifests? I guess they belong to bespoke CI/CD systems and the like? It would simplify life if we could drop v1 support.

I would agree that we should add a section on content-type negotiation to the specification. It's a bit frustrating though, as the way you've worded it will result in different registries returning significantly different results for the same requests (as I guess happens at the minute with the Docker Hub).

thaJeztah · 2020-11-12T10:36:01Z

@amouat I don't have numbers at hand, but with billions of pulls, we definitely get old (or plain "weird") clients that connect.

It's a bit frustrating though, as the way you've worded it will result in different registries returning significantly different results for the same requests (as I guess happens at the minute with the Docker Hub).

I wonder if this can be avoided. Today's v2 will be tomorrow's v1. Content negotiation can help with such transitions, and clients should indicate what content types they can accept. If no Accept header is sent, and the client is thus non-specific, returning the "oldest" acceptable format is (IMO) the best option.

amouat · 2020-11-12T11:12:56Z

Personally, I'd rather never return a v1 manifest. I actually removed all support for v1 from Trow. Supporting v1 manifests is a lot more work than v2. One option might be to refuse v1 uploads but automatically convert v2 manifests to v1 if requested.

thaJeztah · 2020-11-12T11:30:02Z

One option might be to refuse v1 uploads but automatically convert v2 manifests to v1 if requested.

Actually, I may be mixing up v2, v1, and v2 schema1 (I always get confused by those); I think v1 has already been removed (https://www.docker.com/blog/docker-hub-deprecation-1-5/), but v2 schema1 is still "supported" by Hub (current versions of docker will produce a warning (19.03) or error (20.10) to recommend users to pull the schema 2 v1 manifest, and push as schema 2 v2) (moby/moby#39365, moby/moby#41295); docker 20.10 will produce;

[DEPRECATED] support for pushing manifest v2 schema1 images has been removed. More information at https://docs.docker.com/registry/spec/deprecated-schema-v1/

I think the schema 2 v1 manifests are auto-generated by Hub though (again, would have to check)

thaJeztah · 2020-11-12T13:53:05Z

Cleaned up my draft a bit, and posted it as #212

amouat · 2020-11-12T15:46:30Z

Yes @thaJeztah - I was confused as well and also talking about schema 2 v1 (which is vastly different to v2).

jonjohnsonjr · 2020-11-12T16:50:46Z

No, I mean Content-Type. I'm saying that, if it is a manifest list, and no Accept has been provided, it would make a lot more sense if it defaulted to returning the manifest list rather than picking a manifest (seems like it defaults to linux/amd64) and return that.

The behavior is not obvious and kind of counter-intuitive currently, as it's not returning the type that matches my request the closest, in this case: a tag that happens to point at a manifest list.

Ah yeah, I see what you mean. That makes sense for this spec in isolation, but unfortunately (as @thaJeztah described in great detail) the Accept header was used to ease the transition from v2 schema 1 images to v2 schema 2 images, so just dropping this completely would break some backwards compatibility with older clients.

FWIW, the way we "solved" this in GCR was to do what you expect if clients send * or */* in the Accept header. This resolved all of our complaints about GCR being broken with curl, as curl sends Accept: */*:

$ curl -v https://gcr.io/v2/ 2>&1 | grep "Accept:"
> Accept: */*

So you get what you would expect with a manifest list:

$ curl -v https://gcr.io/v2/google-containers/debian-hyperkube-base/manifests/0.12.1 2>&1 | grep content-type
< content-type: application/vnd.docker.distribution.manifest.list.v2+json

But clients that send no Accept header at all (old docker versions) get the fallback behavior:

$ curl -v -H "Accept:" https://gcr.io/v2/google-containers/debian-hyperkube-base/manifests/0.12.1 2>&1 | grep content-type
< content-type: application/vnd.docker.distribution.manifest.v1+prettyjws

This is a somewhat janky solution because it relies on some specific client behavior, but it seems to fall within the spirit of the spec and seems to work for both curl and docker 🤷‍♂️
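The decision described above can be sketched roughly as follows. This only illustrates the behavior shown in the curl output, not GCR's actual implementation. `choose_response_type` is a hypothetical name, and the last two branches (explicit Accept values) are assumptions added to make the function total.

```shell
# Decide which media type to serve for a manifest request.
#   $1: raw Accept header value ("" when the client sent no Accept header)
#   $2: media type of the manifest as stored in the registry
choose_response_type() {
  accept="$1"
  stored="$2"
  legacy="application/vnd.docker.distribution.manifest.v1+prettyjws"
  case "$accept" in
    "")         echo "$legacy" ;;  # no Accept at all: old client, fall back
    "*"|"*/*")  echo "$stored" ;;  # wildcard (e.g. curl): serve what is stored
    *"$stored"*) echo "$stored" ;; # assumption: stored type explicitly accepted
    *)          echo "$legacy" ;;  # assumption: anything else gets the fallback
  esac
}
```

For example, `choose_response_type "*/*" "application/vnd.docker.distribution.manifest.list.v2+json"` prints the manifest list type, while an empty first argument prints the v1+prettyjws fallback.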

jdolitsky added this to the v1.0.0-rc2 milestone Oct 28, 2020

pmengelbert mentioned this issue Oct 29, 2020

Add test and spec section for HEAD requests #208

Merged

amouat mentioned this issue Oct 31, 2020

cache manifests on pull moby/moby#41607

Merged

jonjohnsonjr mentioned this issue Nov 12, 2020

Remaining issues pre v1.0.0-rc2 (Accept headers, etc.) #211

Open

thaJeztah mentioned this issue Nov 17, 2020

Feature request to add arguments --check and -q to docker pull command moby/moby#41390

Open

dmcgowan closed this as completed in #208 Dec 9, 2020
