Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nginx: Clear Last-Modified if ETag is from store #76697

Merged
merged 1 commit into from
Jan 2, 2020

Conversation

aszlig
Copy link
Member

@aszlig aszlig commented Dec 30, 2019

This is what I've suspected a while ago:

Heads-up everyone: After testing this in a few production instances, it seems that some browsers still get cache hits for new store paths (and changed contents) for some reason. I highly suspect that it might be due to the last-modified header (as mentioned in #48337 (comment)).

Going to test this with last-modified disabled for a little while and if this is the case I think we should improve that patch by disabling last-modified if serving from a store path.

Much earlier when I reviewed the patch, I wrote this:

Other than that, it looks good to me.

However, I'm not sure what we should do with Last-Modified header.
From RFC 2616, section 13.3.4:

  • If both an entity tag and a Last-Modified value have been
    provided by the origin server, SHOULD use both validators in
    cache-conditional requests. This allows both HTTP/1.0 and
    HTTP/1.1 caches to respond appropriately.

I'm a bit nervous about the SHOULD here, as user agents in the wild could possibly just use Last-Modified and use the cached content instead.

Unfortunately, I didn't pursue this any further back then because @pbogdan noted the following:

Hmm, could they (assuming they are conforming):

  • If an entity tag has been provided by the origin server, MUST
    use that entity tag in any cache-conditional request (using If-
    Match or If-None-Match).

Since running with this patch in some deployments, I found that both Firefox and Chrome/Chromium do NOT re-validate against the ETag if the Last-Modified header is still the same.

So I wrote a small NixOS VM test with Geckodriver to have a test case which is closer to the real world and I indeed was able to reproduce this.

Whether this is actually a bug in Chrome or Firefox is an entirely different issue and even IF it is the fault of the browsers and it is fixed at some point, we'd still need to handle this for older browser versions.

Apart from clearing the header, I also recreated the patch by using a plain git diff with a small description on top. This should make it easier for future authors to work on that patch.

Cc: @juskrey

This is what I've suspected a while ago[1]:

> Heads-up everyone: After testing this in a few production instances,
> it seems that some browsers still get cache hits for new store paths
> (and changed contents) for some reason. I highly suspect that it might
> be due to the last-modified header (as mentioned in [2]).
>
> Going to test this with last-modified disabled for a little while and
> if this is the case I think we should improve that patch by disabling
> last-modified if serving from a store path.

Much earlier[2] when I reviewed the patch, I wrote this:

> Other than that, it looks good to me.
>
> However, I'm not sure what we should do with Last-Modified header.
> From RFC 2616, section 13.3.4:
>
> - If both an entity tag and a Last-Modified value have been
>   provided by the origin server, SHOULD use both validators in
>   cache-conditional requests. This allows both HTTP/1.0 and
>   HTTP/1.1 caches to respond appropriately.
>
> I'm a bit nervous about the SHOULD here, as user agents in the wild
> could possibly just use Last-Modified and use the cached content
> instead.

Unfortunately, I didn't pursue this any further back then because
@pbogdan noted[3] the following:

> Hmm, could they (assuming they are conforming):
>
>  * If an entity tag has been provided by the origin server, MUST
>    use that entity tag in any cache-conditional request (using If-
>    Match or If-None-Match).

Since running with this patch in some deployments, I found that both
Firefox and Chrome/Chromium do NOT re-validate against the ETag if the
Last-Modified header is still the same.

So I wrote a small NixOS VM test with Geckodriver to have a test case
which is closer to the real world and I indeed was able to reproduce
this.

Whether this is actually a bug in Chrome or Firefox is an entirely
different issue and even IF it is the fault of the browsers and it is
fixed at some point, we'd still need to handle this for older browser
versions.

Apart from clearing the header, I also recreated the patch by using a
plain "git diff" with a small description on top. This should make it
easier for future authors to work on that patch.

[1]: NixOS#48337 (comment)
[2]: NixOS#48337 (comment)
[3]: NixOS#48337 (comment)

Signed-off-by: aszlig <aszlig@nix.build>
@aszlig
Copy link
Member Author

aszlig commented Dec 30, 2019

@GrahamcOfBorg test nginx-etag

@gebner
Copy link
Member

gebner commented Dec 30, 2019

I'm not sure I understand the issue completely. In the test, we construct two directories, /nix/store/aaaaaaaaaa-testdir and /nix/store/bbbbbbbbbb-testdir2, each containing an index.html and a test.js file. Then the following happens:

  1. We start nginx with root /nix/store/aaaaaaaaaa-testdir.
  2. Firefox requests /test.js and receives Last-Modified: 1 and ETag: aaaaaaaaaa
  3. We change the nginx configuration to root /nix/store/bbbbbbbbbb-testdir2, which contains a test.js file with different content.
  4. Firefox requests /test.js and receives Last-Modified: 1 and ETag: bbbbbbbbbb
  5. The Last-Modified header did not change, so firefox uses the cached version of test.js. (Which is bad, since test.js now points to a different file.)

The PR then fixes this issue by removing the Last-Modified header, right?

However AFAICT this still doesn't fix caching completely: e.g. if testdir and testdir2 were both created by the same derivation. That is, we change the root from /nix/store/cccccccccc-combined/testdir to /nix/store/cccccccccc-combined/testdir2. Then nginx would always respond with ETag: cccccccccc no matter whether we're in testdir or testdir2 and firefox would still use the cached version.

@aszlig
Copy link
Member Author

aszlig commented Dec 30, 2019

@gebner:

2. Firefox requests `/test.js` and receives `Last-Modified: 1` and `ETag: aaaaaaaaaa`

Yes, but this is exactly what this is fixing, because this was the behaviour prior to the fix and causes Firefox to load it from cache even if ETag is bbbbbbbbbb. So the fix simply strips the Last-Modified header.

The PR then fixes this issue by removing the Last-Modified header, right?

Correct.

However AFAICT this still doesn't fix caching completely: e.g. if testdir and testdir2 were both created by the same derivation. That is, we change the root from /nix/store/cccccccccc-combined/testdir to /nix/store/cccccccccc-combined/testdir2. Then nginx would always respond with ETag: cccccccccc no matter whether we're in testdir or testdir2 and firefox would still use the cached version.

This is correct, however in practice something like this should rarely (if at all) happen and in almost all cases this would result in a different store path. The only thing that I could think of right now, is if you got the wrong directory, run into a HTTP 404 and fix the root directory... however in this case no caching would be involved.

So I think this is only a problem if you request something like /index.html from testdir first, realise you got the wrong path, change it to testdir2 (which also has an index.html) and then you'd end up getting cached content from the browser.

Do you know a practical example where something like this would happen more frequently?

Copy link
Member

@domenkozar domenkozar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a backport for 19.09.

aszlig added a commit that referenced this pull request Jan 2, 2020
This fixes the patch for nginx to clear the Last-Modified header if a
static file is served from the Nix store.

So far we only used the ETag from the store path, but if the
Last-Modified header is always set to "Thu, 01 Jan 1970 00:00:01 GMT",
Firefox and Chrome/Chromium seem to ignore the ETag and simply use the
cached content instead of revalidating.

Alongside the fix, this also adds a dedicated NixOS VM test, which uses
WebDriver and Firefox to check whether the content is actually served
from the browser's cache and to have a more real-world test case.
@aszlig aszlig merged commit ccf55be into NixOS:master Jan 2, 2020
@aszlig
Copy link
Member Author

aszlig commented Jan 2, 2020

Backported in f7bc988.

@aszlig aszlig deleted the nginx-etag-last-modified-fix branch January 24, 2020 01:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants