-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Document ETag and Precondition Semantics #250
Comments
@awakecoding might make sense to continue your ETag discussion here. |
@jonjohnsonjr @SteveLasker excellent, let's discuss it here! Here is my primary goal: find a way to keep browser Etag compatibility (opaque value) while also passing a non-opaque Etag-like digest-based value, making it possible to implement caching without the need for associative data attached to a specific file, especially for manifests. By this I mean being able to correctly recover the Etag-like value associated with a cached manifest file on disk without some additional metadata to store the corresponding Etag. This is very similar to the blobs that are digest-based, except made usable for manifests, and would greatly simplify efficient manifest update checks without uselessly pulling the manifest contents. The Etag standard HTTP header is opaque by definition. Browsers and HTTP caches implement this mechanism such that if you cache an HTTP response along with the ETag value, you can make the same query again including the ETag value, and avoid receiving the response contents if it hasn't changed. This mechanism is almost perfect aside from the fact that the Etag value is opaque by definition, meaning we can't just declare it to be digest-based, and therefore non-opaque. A proper ETag implementation doesn't try interpreting the ETag value, it just stores it as a tag alongside the cache data. This being said, ETag values are often digest-based, even though you cannot rely on them being digest-based. Here are the ETag and Docker-Content-Digest headers from a response I captured in Fiddler between the ORAS CLI and Azure Container Registry:
For some odd reason, the ETag is the same as the Docker-Content-Digest except that it is within double quotes. Now if you look at these values, it can be easy to think that one may just try interpreting them as digest values anyway, but we would need a standard approach to disambiguation between opaque and non-opaque values passed within the Etag header, so it's still a dangerous thing to try. Even then, a server may not implement non-opaque Etag headers, which would result in constant Etag mismatches if the client keeps rebuilding the value from the content digests. Opaque Etag headers it is for that reason. Here is my suggestion: use a header very similar to the "Docker-Content-Digest" that works like the ETag header, with the exception that is is defined to non-opaque and digest-based. Implementations supporting this ETag variant would interpret its value to be digest-based. The same value would be inserted into the ETag header, but one wouldn't be able to assume that standard Etag header is digest-based. This would make it possible for implementations to work using only the "OCI-Content-Digest" header where the value is not stored in metadata, but simply recovered by rehashing the local manifest contents. Any thoughts? |
I believe this is a requirement from the RFC.
This is somewhat reasonable to me, but I wonder if we can avoid defining a new header. It seems to me that I would be fine with some SHOULD language around making the I could also see something similar to #251 (comment) where we have registries return a header to indicate that they follow this deterministic etag behavior. |
|
I think down-conversion affecting the ETag is fine and it probably should. If you are an old client that gets served a down-converted manifest, you probably should not be mutating the tag to point to a new value based on what was returned to you. As a client, if you're unable to even pull down the manifest in its "pushed" form, modifying it is almost certainly unsafe.
I think that's implicit because the |
@jonjohnsonjr aren't Etag headers something that can also be inserted by caching proxies? I wonder if caching proxies can also optionally decide to replace Etag headers with their own if the origin already set an Etag. Because they are by definition opaque it's hard to predict what can be done with them in all cases. |
https://tools.ietf.org/html/rfc2616?spm=5176.doc32013.2.3.Aimyd7#section-13.5.2
To your point, though, we may want to talk about the |
@sargun - Just checked with ACR and it does do digest based ETAGs. I would good to describe the S3 issue on this thread.
|
The concern raised was a replicated registry server with an S3 backing cannot guarantee the current state of the ETag since another replica of the registry server may be modifying the tag pointer. The next step is to follow up with the distribution maintainers on the feasibility to implement this. |
punting to 1.2 |
One of the most important checks a registry client is doing regularly are checking if a tag has changed (all other artifacts are immutable). So having a (HTTP compatible) way to do this should be mentioned explicitly in the spec - especially since there is currently no way (OCI branded header) to do that without “Docker-“ headers. unfortunately the obvious/clear usage of digest as e-Tag together with HEAD and/or conditional GET it is either specified that you need to cache the Etage value independent from the hash OR define the format of the ETag (which is kind of layering violation). Personally I am for specifying that a GET+HEAD on a tag must return the digest as ETag and it should support Also question is is Last-Modified should be mandated. BTw it’s unfortunate that you can’t see the digest a label points to in the list api (also mentioned here: #320 (comment)) |
These are just part of HTTP already, so we don't really need to define them (IMO), but it would be nice if clients could rely on these headers.
We discussed this on the dev call a few weeks ago, filing this as a placeholder for a coming pull request to clearly document the expect behavior so that registry implementations can start to follow it.
The text was updated successfully, but these errors were encountered: