-
-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nginx: if root is in Nix store, use path's hash as ETag #48337
Conversation
Success on aarch64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-darwin (full log) Attempted: nginx Partial log (click to expand)
|
+1 let’s include it right in nixpkgs.
…Sent from my iPhone
On Oct 14, 2018, at 10:09 AM, Yegor Timoshenko ***@***.***> wrote:
@yegortimoshenko commented on this pull request.
In pkgs/servers/http/nginx/generic.nix:
> @@ -17,6 +17,12 @@ stdenv.mkDerivation {
inherit sha256;
};
+ patches = [
+ (fetchpatch {
Maybe. @grahamc, what do you think?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I believe The etag header has some semantics about how browsers should treat the value. Does this patch follow those rules?
…Sent from my iPhone
On Oct 14, 2018, at 2:23 PM, Graham Christensen ***@***.***> wrote:
+1 let’s include it right in nixpkgs.
Sent from my iPhone
> On Oct 14, 2018, at 10:09 AM, Yegor Timoshenko ***@***.***> wrote:
>
> @yegortimoshenko commented on this pull request.
>
> In pkgs/servers/http/nginx/generic.nix:
>
> > @@ -17,6 +17,12 @@ stdenv.mkDerivation {
> inherit sha256;
> };
>
> + patches = [
> + (fetchpatch {
> Maybe. @grahamc, what do you think?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub, or mute the thread.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Yes, I believe so. Browsers will use ETag as a validation token to ensure content didn't change, and that maps onto Nix store semantics very well. There is also some syntax involved and that's conformant too (double quotes, optionally weak prefix but it doesn't apply here). This patch seems to generate RFC 7232 compliant tags. Additionally, Nginx will also properly handle this ETag and respond with |
Included patch into the PR. |
Come to think of it, there is a concern: if Nix store path contains impure symlinks outside of Nix store, and Nginx has Meaning, my assumption was that if Situation is complicated by the fact that If this is undesirable, I think I have an idea how to make it work while keeping patch relatively small and contained: make sure that last modified time is 1970-01-01 00:00:00 before checking if root is in Nix store. |
Success on x86_64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-darwin (full log) Attempted: nginx Partial log (click to expand)
|
Success on aarch64-linux (full log) Attempted: nginx Partial log (click to expand)
|
I think whatever we do here, it shouldn't be the default. We can make it convenient to turn on Nix+Nginx behaviour for those that understand what it does. |
Why? If Nix store symlinks are handled properly, this shouldn't cause any regressions whatsoever. Since this Nginx is Nixpkgs-distributed, it seems that it would make sense for it to correctly work with Nix. |
Success on x86_64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on aarch64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-darwin (full log) Attempted: nginx Partial log (click to expand)
|
I've pushed two commits: one adds missing This patch shouldn't cause any regressions anymore. |
Success on aarch64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-linux (full log) Attempted: nginx Partial log (click to expand)
|
Success on x86_64-darwin (full log) Attempted: nginx Partial log (click to expand)
|
@yegortimoshenko because we're changing expected behavior in nginx, it could have implications we can't predict. Although to be fair, current "cache forever" is pretty bad so maybe it's not that bad :) |
We should probably merge this. |
@aszlig Ack! Thank you for working on this. |
nginx: if root is in Nix store, use path's hash as ETag
nginx: if root is in Nix store, use path's hash as ETag
Since NixOS/nixpkgs#48337 we no longer have the issue that we get the same ETags for different store paths but the hash of the store path is now the ETag. This means, that we no longer need to disable caching if the patch is applied to nginx. Unfortunately, the patch is only in NixOS Unstable at the moment, hence the conditional on whether the patch exists. Even if the patch is backported to 19.03, we'd still need it in 18.09, which we currently still support as well. In order to apply the patch for eg. NixOS 19.03, something like this needs to be put in the system configuration: { pkgs, lib, ... }: { # ... other configuration definitions services.nginx.package = pkgs.nginx.overrideAttrs (drv: { patches = (drv.patches or []) ++ lib.singleton (fetchurl { url = "https://raw.githubusercontent.com/NixOS/nixpkgs/master/" + "pkgs/servers/http/nginx/nix-etag-1.15.4.patch"; sha256 = "0w7sbvfrf0s20lyfr99r5d13rd97nd3c4n569n9ldy7a1r7nx019"; }); }); # ... other configuration definitions } Signed-off-by: aszlig <aszlig@nix.build>
Heads-up everyone: After testing this in a few production instances, it seems that some browsers still get cache hits for new store paths (and changed contents) for some reason. I highly suspect that it might be due to the Going to test this with |
ETag should take precedence over Last-Modified: https://stackoverflow.com/questions/824152/what-takes-precedence-the-etag-or-last-modified-http-header/1560098 |
@yegortimoshenko: Yep, in theory it should, see also #48337 (comment). |
Some browsers indeed show very strange caching behavior when both ETag and last-modified are present. At least Chrome seems to keep etag-ged file for a while when last-modified is old |
@juskrey: Mhm, I just could confirm exactly this. I think we also need to disable last-modified if we provide an ETag from the store. |
Here is a small NixOS VM test reproducing the issue even on Firefox, pull request with the test and fix following soon... |
This is what I've suspected a while ago[1]: > Heads-up everyone: After testing this in a few production instances, > it seems that some browsers still get cache hits for new store paths > (and changed contents) for some reason. I highly suspect that it might > be due to the last-modified header (as mentioned in [2]). > > Going to test this with last-modified disabled for a little while and > if this is the case I think we should improve that patch by disabling > last-modified if serving from a store path. Much earlier[2] when I reviewed the patch, I wrote this: > Other than that, it looks good to me. > > However, I'm not sure what we should do with Last-Modified header. > From RFC 2616, section 13.3.4: > > - If both an entity tag and a Last-Modified value have been > provided by the origin server, SHOULD use both validators in > cache-conditional requests. This allows both HTTP/1.0 and > HTTP/1.1 caches to respond appropriately. > > I'm a bit nervous about the SHOULD here, as user agents in the wild > could possibly just use Last-Modified and use the cached content > instead. Unfortunately, I didn't pursue this any further back then because @pbogdan noted[3] the following: > Hmm, could they (assuming they are conforming): > > * If an entity tag has been provided by the origin server, MUST > use that entity tag in any cache-conditional request (using If- > Match or If-None-Match). Since running with this patch in some deployments, I found that both Firefox and Chrome/Chromium do NOT re-validate against the ETag if the Last-Modified header is still the same. So I wrote a small NixOS VM test with Geckodriver to have a test case which is closer to the real world and I indeed was able to reproduce this. Whether this is actually a bug in Chrome or Firefox is an entirely different issue and even IF it is the fault of the browsers and it is fixed at some point, we'd still need to handle this for older browser versions. Apart from clearing the header, I also recreated the patch by using a plain "git diff" with a small description on top. This should make it easier for future authors to work on that patch. [1]: NixOS#48337 (comment) [2]: NixOS#48337 (comment) [3]: NixOS#48337 (comment) Signed-off-by: aszlig <aszlig@nix.build>
Done: #76697 |
This is what I've suspected a while ago[1]: > Heads-up everyone: After testing this in a few production instances, > it seems that some browsers still get cache hits for new store paths > (and changed contents) for some reason. I highly suspect that it might > be due to the last-modified header (as mentioned in [2]). > > Going to test this with last-modified disabled for a little while and > if this is the case I think we should improve that patch by disabling > last-modified if serving from a store path. Much earlier[2] when I reviewed the patch, I wrote this: > Other than that, it looks good to me. > > However, I'm not sure what we should do with Last-Modified header. > From RFC 2616, section 13.3.4: > > - If both an entity tag and a Last-Modified value have been > provided by the origin server, SHOULD use both validators in > cache-conditional requests. This allows both HTTP/1.0 and > HTTP/1.1 caches to respond appropriately. > > I'm a bit nervous about the SHOULD here, as user agents in the wild > could possibly just use Last-Modified and use the cached content > instead. Unfortunately, I didn't pursue this any further back then because @pbogdan noted[3] the following: > Hmm, could they (assuming they are conforming): > > * If an entity tag has been provided by the origin server, MUST > use that entity tag in any cache-conditional request (using If- > Match or If-None-Match). Since running with this patch in some deployments, I found that both Firefox and Chrome/Chromium do NOT re-validate against the ETag if the Last-Modified header is still the same. So I wrote a small NixOS VM test with Geckodriver to have a test case which is closer to the real world and I indeed was able to reproduce this. Whether this is actually a bug in Chrome or Firefox is an entirely different issue and even IF it is the fault of the browsers and it is fixed at some point, we'd still need to handle this for older browser versions. Apart from clearing the header, I also recreated the patch by using a plain "git diff" with a small description on top. This should make it easier for future authors to work on that patch. [1]: NixOS#48337 (comment) [2]: NixOS#48337 (comment) [3]: NixOS#48337 (comment) Signed-off-by: aszlig <aszlig@nix.build> (cherry picked from commit ccf55be)
This is what I've suspected a while ago[1]: > Heads-up everyone: After testing this in a few production instances, > it seems that some browsers still get cache hits for new store paths > (and changed contents) for some reason. I highly suspect that it might > be due to the last-modified header (as mentioned in [2]). > > Going to test this with last-modified disabled for a little while and > if this is the case I think we should improve that patch by disabling > last-modified if serving from a store path. Much earlier[2] when I reviewed the patch, I wrote this: > Other than that, it looks good to me. > > However, I'm not sure what we should do with Last-Modified header. > From RFC 2616, section 13.3.4: > > - If both an entity tag and a Last-Modified value have been > provided by the origin server, SHOULD use both validators in > cache-conditional requests. This allows both HTTP/1.0 and > HTTP/1.1 caches to respond appropriately. > > I'm a bit nervous about the SHOULD here, as user agents in the wild > could possibly just use Last-Modified and use the cached content > instead. Unfortunately, I didn't pursue this any further back then because @pbogdan noted[3] the following: > Hmm, could they (assuming they are conforming): > > * If an entity tag has been provided by the origin server, MUST > use that entity tag in any cache-conditional request (using If- > Match or If-None-Match). Since running with this patch in some deployments, I found that both Firefox and Chrome/Chromium do NOT re-validate against the ETag if the Last-Modified header is still the same. So I wrote a small NixOS VM test with Geckodriver to have a test case which is closer to the real world and I indeed was able to reproduce this. Whether this is actually a bug in Chrome or Firefox is an entirely different issue and even IF it is the fault of the browsers and it is fixed at some point, we'd still need to handle this for older browser versions. Apart from clearing the header, I also recreated the patch by using a plain "git diff" with a small description on top. This should make it easier for future authors to work on that patch. [1]: #48337 (comment) [2]: #48337 (comment) [3]: #48337 (comment) Signed-off-by: aszlig <aszlig@nix.build> (cherry picked from commit ccf55be) Reason: The issue breaks setups that serve static content via Nix store paths. I've also backported the NixOS VM test from Python to Perl.
While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example. Recently, @devhell has reported a bug with a nginx location directive like this: location ~^/\~([a-z0-9_]+)(/.*)?$" { alias /home/$1/public_html$2; } While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this: *2 http script copy: "/home/" *2 http script capture: "foo" *2 http script copy: "/public_html/" *2 http script capture: "bar.txt" In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the *configured* value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls: lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory) During my review[1] of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error. Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error so let's - at least for now - silently skip return value if realpath(3) has failed. However since we're using the unaltered root from the config we have another issue, consider this root: /nix/store/...-abcde/$1 Calling realpath(3) on this path will fail (except if there's a file called "$1" of course), so even this fix is not enough because it results in the ETag not being set to the store path hash. While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files. I added a small NixOS VM test, which uses the example above as a regression test. It seems that my memory is failing these days, since apparently I *knew* about this issue since digging for existing issues in nixpkgs, I found this similar pull request which I even reviewed: NixOS#66532 However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to keep this very commit and do a follow-up pull request. [1]: NixOS#48337 Signed-off-by: aszlig <aszlig@nix.build> Reported-by: @devhell (cherry picked from commit f6c7635)
While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example. Recently, @devhell has reported a bug with a nginx location directive like this: location ~^/\~([a-z0-9_]+)(/.*)?$" { alias /home/$1/public_html$2; } While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this: *2 http script copy: "/home/" *2 http script capture: "foo" *2 http script copy: "/public_html/" *2 http script capture: "bar.txt" In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the *configured* value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls: lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory) During my review[1] of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error. Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error so let's - at least for now - silently skip return value if realpath(3) has failed. However since we're using the unaltered root from the config we have another issue, consider this root: /nix/store/...-abcde/$1 Calling realpath(3) on this path will fail (except if there's a file called "$1" of course), so even this fix is not enough because it results in the ETag not being set to the store path hash. While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files. I added a small NixOS VM test, which uses the example above as a regression test. It seems that my memory is failing these days, since apparently I *knew* about this issue since digging for existing issues in nixpkgs, I found this similar pull request which I even reviewed: #66532 However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to keep this very commit and do a follow-up pull request. [1]: #48337 Signed-off-by: aszlig <aszlig@nix.build> Reported-by: @devhell Acked-by: @7c6f434c Acked-by: @yorickvP Merges: #80671 Fixes: #66532
While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example. Recently, @devhell has reported a bug with a nginx location directive like this: location ~^/\~([a-z0-9_]+)(/.*)?$" { alias /home/$1/public_html$2; } While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this: *2 http script copy: "/home/" *2 http script capture: "foo" *2 http script copy: "/public_html/" *2 http script capture: "bar.txt" In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the *configured* value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls: lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory) During my review[1] of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error. Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error so let's - at least for now - silently skip return value if realpath(3) has failed. However since we're using the unaltered root from the config we have another issue, consider this root: /nix/store/...-abcde/$1 Calling realpath(3) on this path will fail (except if there's a file called "$1" of course), so even this fix is not enough because it results in the ETag not being set to the store path hash. While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files. I added a small NixOS VM test, which uses the example above as a regression test. It seems that my memory is failing these days, since apparently I *knew* about this issue since digging for existing issues in nixpkgs, I found this similar pull request which I even reviewed: #66532 However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to keep this very commit and do a follow-up pull request. [1]: #48337 Signed-off-by: aszlig <aszlig@nix.build> Reported-by: @devhell Acked-by: @7c6f434c Acked-by: @yorickvP Merges: #80671 Fixes: #66532 (cherry picked from commit e1d63ad)
While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example. Recently, @devhell has reported a bug with a nginx location directive like this: location ~^/\~([a-z0-9_]+)(/.*)?$" { alias /home/$1/public_html$2; } While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this: *2 http script copy: "/home/" *2 http script capture: "foo" *2 http script copy: "/public_html/" *2 http script capture: "bar.txt" In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the *configured* value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls: lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory) During my review[1] of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error. Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error so let's - at least for now - silently skip return value if realpath(3) has failed. However since we're using the unaltered root from the config we have another issue, consider this root: /nix/store/...-abcde/$1 Calling realpath(3) on this path will fail (except if there's a file called "$1" of course), so even this fix is not enough because it results in the ETag not being set to the store path hash. While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files. I added a small NixOS VM test, which uses the example above as a regression test. It seems that my memory is failing these days, since apparently I *knew* about this issue since digging for existing issues in nixpkgs, I found this similar pull request which I even reviewed: #66532 However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to keep this very commit and do a follow-up pull request. [1]: #48337 Signed-off-by: aszlig <aszlig@nix.build> Reported-by: @devhell Acked-by: @7c6f434c Acked-by: @yorickvP Merges: #80671 Fixes: #66532 (cherry picked from commit e1d63ad)
While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example. Recently, @devhell has reported a bug with a nginx location directive like this: location ~^/\~([a-z0-9_]+)(/.*)?$" { alias /home/$1/public_html$2; } While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this: *2 http script copy: "/home/" *2 http script capture: "foo" *2 http script copy: "/public_html/" *2 http script capture: "bar.txt" In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the *configured* value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls: lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory) During my review[1] of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error. Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error so let's - at least for now - silently skip return value if realpath(3) has failed. However since we're using the unaltered root from the config we have another issue, consider this root: /nix/store/...-abcde/$1 Calling realpath(3) on this path will fail (except if there's a file called "$1" of course), so even this fix is not enough because it results in the ETag not being set to the store path hash. While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files. I added a small NixOS VM test, which uses the example above as a regression test. It seems that my memory is failing these days, since apparently I *knew* about this issue since digging for existing issues in nixpkgs, I found this similar pull request which I even reviewed: NixOS#66532 However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to keep this very commit and do a follow-up pull request. [1]: NixOS#48337 Signed-off-by: aszlig <aszlig@nix.build> Reported-by: @devhell Acked-by: @7c6f434c Acked-by: @yorickvP Merges: NixOS#80671 Fixes: NixOS#66532 (cherry picked from commit e1d63ad)
Motivation for this change
Resolves #25485. Usage example:
cc @Akii @chris-martin @domenkozar
P.S. I didn't commit patch into Nixpkgs repo because that seems to be the preference, but it would be nice to have the patch reviewed. Tell me if I should commit it anyway.
Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nox --run "nox-review wip"
./result/bin/
)nix path-info -S
before and after)