allow back-forward cache to no-store pages as long as cookies do not change #7189

Open
fergald opened this issue Oct 8, 2021 · 14 comments

Comments

@fergald

fergald commented Oct 8, 2021

Forking this off from #5744 and #5879.

Looking for feedback. I'm not sure if this is something we would try to spec or not but it's something that could increase BFCache hit-rate.

Cache-control: no-store (CCNS) is the largest BFCache blocker in Android Chrome. About 15% of history navigations are blocked by CCNS (6% for that reason alone, 9% combined with other reasons; 80-90% of those other reasons are not spec blockers, just things Chrome needs to fix, though some are timeouts). We check for CCNS only on the main frame, not on subframes.

The problem that occurred when Safari allowed CCNS pages into BFCache was that, on shared devices, the next user could go back to the previous user's logged-in page. It's also a problem that pages in BFCache may be out of date when significant state changes. Cookies usually capture the fact that this state has changed, and for the login/logout state, which is critical, they should always capture it.

On Android Chrome we ran an experiment to measure how often cookies change when in BFCache. We considered cookies that are accessible by the main frame of the page (since we don't block based on subframe CCNS) and we broke it down by HTTP-only cookies and other cookies. The results were very promising.

Of the 15% blocked, we saw that less than 1% had HTTP-only cookie modification while in BFCache and less than 6% had any cookie modification while in BFCache. To be clear, the pages in BFCache did not modify the cookies, some other active page on the same site did.

The most aggressive strategy would be to evict CCNS pages only if an HTTP-only cookie is modified. This assumes that sites follow best practice and use HTTP-only cookies for login and other critical info, and it would recover about 12 percentage points of cache hit-rate.

More conservative would be to evict CCNS pages if any cookie is modified.
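
The two strategies above can be written down as a small decision function. This is only an illustrative model: `EvictionPolicy`, `CookieChange` and `shouldEvict` are hypothetical names, not anything from Chrome or a spec.

```typescript
// Hypothetical model of the two eviction strategies described above.
// None of these names come from Chrome or from any specification.
type EvictionPolicy = "http-only-cookie-change" | "any-cookie-change";

interface CookieChange {
  name: string;
  httpOnly: boolean;
}

// Returns true if a cached CCNS page should be evicted after `change`.
function shouldEvict(policy: EvictionPolicy, change: CookieChange): boolean {
  switch (policy) {
    case "http-only-cookie-change":
      // Aggressive: only HTTP-only cookies (e.g. auth cookies) trigger eviction.
      return change.httpOnly;
    case "any-cookie-change":
      // Conservative: every cookie modification triggers eviction.
      return true;
  }
}

const sessionCookie: CookieChange = { name: "session", httpOnly: true };
const themeCookie: CookieChange = { name: "theme", httpOnly: false };

console.log(shouldEvict("http-only-cookie-change", sessionCookie)); // true
console.log(shouldEvict("http-only-cookie-change", themeCookie)); // false
console.log(shouldEvict("any-cookie-change", themeCookie)); // true
```

The only difference between the two policies is how a non-HTTP-only cookie change is treated.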

There are upcoming changes to authentication (WebAuthn), however I believe these only impact how the user acquires their auth cookie but will not remove the need for auth cookies in the future (if something did, then I expect we would want to use it as an eviction signal too).

cc @annevk @smaug---- @mystor @cdumez @beidson @hober @altimin @xharaken @rubberyuzu @rakina @domenic

@annevk
Member

annevk commented Oct 12, 2021

That seems like an interesting middle-ground. Websites might still want to preserve these entries though if they know their user is not on a shared device (this is sometimes asked when logging in). Should there be a way to do that?

@fergald
Author

fergald commented Oct 12, 2021

We could add an API where the site names the cookies that matter (which could be the empty set or some other way to explicitly say that cookies don't matter). I wonder how much use that would get though.

@domenic
Member

domenic commented Oct 12, 2021

So right now the spec doesn't say anything about Cache-Control and bfcache at all, IIUC. So this is kind of a sub-question of how to resolve #5879, I believe. I'll add a relevant comment on the general problem space over there...

@fergald
Author

fergald commented Oct 15, 2021

I think we can add tests for scenarios like this that are not explicitly specced one way or the other, so that devs can see what is supported. In this case, one test would set CCNS, navigate away and then navigate back. Currently all browsers would give precondition failed, but those that implement some kind of CCNS caching would start to pass.
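
As a rough sketch of the check such a test might make after navigating back: it assumes the `persisted` flag on the browser's `pageshow` event is the signal for a BFCache restore. The `PageShowLike` type, the `ccnsBFCacheOutcome` function and the outcome strings are made up for illustration and are not an actual test-harness API.

```typescript
// Minimal stand-in for the browser's `pageshow` event, which carries a
// `persisted` flag that is true when the page was restored from BFCache.
interface PageShowLike {
  persisted: boolean;
}

// Hypothetical outcome labels, mirroring the wording in the comment above.
type Outcome = "PASS" | "PRECONDITION_FAILED";

// Classifies the result of "set CCNS, navigate away, navigate back".
function ccnsBFCacheOutcome(event: PageShowLike): Outcome {
  return event.persisted ? "PASS" : "PRECONDITION_FAILED";
}

// In a browser this would be wired up roughly as:
//   window.addEventListener("pageshow", (e) => report(ccnsBFCacheOutcome(e)));
// where `report` is whatever the test harness provides (assumed here).
console.log(ccnsBFCacheOutcome({ persisted: false })); // PRECONDITION_FAILED
console.log(ccnsBFCacheOutcome({ persisted: true })); // PASS
```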

@fergald
Author

fergald commented Jul 15, 2022

Explainer for this here.

TL;DR: Adds an API to specify which cookies matter, and we evict all pages on that origin if one of those cookies changes. With that in place, we propose to allow all pages with Cache-Control: no-store to enter BFCache (with some caveats).

We really want feedback on:

  • the functionality of evicting on cookie change vs the alternative of providing an explicit API to evict
  • moving to caching documents that have Cache-Control: no-store
  • where this will go wrong that we have missed
    • risky cases we have missed
    • risky cases that are much more common than we think
    • cases that are impossible to mitigate
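
To make the TL;DR above concrete, here is a toy model of "the site names the cookies that matter, and a change to one of them evicts every cached page on that origin". It is not the explainer's API: `BFCacheModel` and all of its methods are hypothetical, and treating an origin that declared nothing as having no relevant cookies is a simplification made here.

```typescript
// Toy model only; every name below is hypothetical.
interface CachedPage {
  url: string;
  origin: string;
}

class BFCacheModel {
  private pages: CachedPage[] = [];
  // origin -> names of the cookies that the site declared as mattering
  private watched = new Map<string, Set<string>>();

  declareRelevantCookies(origin: string, names: string[]): void {
    this.watched.set(origin, new Set(names));
  }

  store(url: string): void {
    this.pages.push({ url, origin: new URL(url).origin });
  }

  has(url: string): boolean {
    return this.pages.some((p) => p.url === url);
  }

  // Called when a cookie for `origin` is set, deleted or expires.
  // Returns the URLs that were evicted as a result.
  onCookieChange(origin: string, cookieName: string): string[] {
    const names = this.watched.get(origin);
    if (!names || !names.has(cookieName)) return [];
    const evicted = this.pages.filter((p) => p.origin === origin).map((p) => p.url);
    this.pages = this.pages.filter((p) => p.origin !== origin);
    return evicted;
  }
}

const cache = new BFCacheModel();
cache.declareRelevantCookies("https://bank.example", ["session"]);
cache.store("https://bank.example/account");
cache.store("https://news.example/story");

console.log(cache.onCookieChange("https://bank.example", "theme")); // []
console.log(cache.onCookieChange("https://bank.example", "session")); // [ 'https://bank.example/account' ]
console.log(cache.has("https://news.example/story")); // true
```

In this model a logout that deletes the `session` cookie evicts the cached account page, while an unrelated cookie change or another origin's pages are left alone.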

@fergald
Author

fergald commented Jul 15, 2022

CCing a few people who responded on #5744 but were not already CCed on this

@mystor @jakearchibald @geoffreygaren @jakub-g

@smaug----

smaug---- commented Jul 15, 2022

I don't pretend to understand the explainer.
"eviction is triggered on a change". If a page is in bfcache (from which it can be evicted), how can a cookie change?

Ah, I see perhaps "Key scenarios" explains it.

@fergald
Author

fergald commented Jul 15, 2022

The cookies can change in other open instances of the site (other windows/tabs). They can also expire.

@fergald
Author

fergald commented Jul 15, 2022

Actually, it can even be the same window/tab. The original example Apple talked about was being on somebank.com/page and hitting logout. So somebank.com/page is in the cache now but the cookies have been deleted by the headers returned by the logout URL. With this proposal it would be evicted due to the cookie change. Other tabs/windows are also a problem.

@smaug----

It wouldn't be too far-fetched to think of a site which uses no-store explicitly to prevent bfcache, so that loading a new page in the same tab won't keep the old page in bfcache. (Users don't explicitly log out.)

I'm just a tiny bit worried, since no-store has been the way to block bfcache since 2005.

@fergald
Author

fergald commented Jul 15, 2022

Yes, that is definitely a concern with the follow-on proposal. We expect that some things will go wrong. We believe they are all fixable and that we can avoid sensitive cases by monitoring cookies. So the question is whether the benefit (around 14% of history navigations on Android switching from non-cached to cached, I don't have the desktop number handy) outweighs the danger.

Concrete examples of sites that block BFCache with CCNS for legitimate reasons would be helpful.

In #5744 there seemed to be broad agreement that straight-up blocking BFCache is not a feature we want to provide but current CCNS is that.

Our own user research indicates quite a bit of confusion among web developers about when/why to use CCNS and how it interacts with BFCache. We even had some sites who were setting the header remove it in response to our questions.

Feedback on just the cookie API (and not the follow-on) is also appreciated, since it gives people a way to opt in to BFCache while still setting CCNS.

@Maxim-Mazurok

This would create a problem with versioning if the page has references to dynamically loaded content. For example, with AngularJS it might have conditional ng-include="my-template.html". So if the user clicks on something - it will attempt to load that HTML from the server. And if we release a new version where that template is removed - it will result in 404. So I would highly recommend having an easy way for web apps to communicate that the main resource (aka entry page) shouldn't be cached.
It is frustrating enough already that I can't use no-cache for this situation, because the browser won't make a request to get 304. Now due to the "optimisation", I have to do no-store so that it's re-fetched every time, and now it has to be 200 and not 304, which would've been better.

@fergald
Author

fergald commented May 8, 2024

> This would create a problem with versioning if the page has references to dynamically loaded content. For example, with AngularJS it might have conditional ng-include="my-template.html". So if the user clicks on something - it will attempt to load that HTML from the server. And if we release a new version where that template is removed - it will result in 404.

Don't you have this problem already without BFCache? If:

  • I load the page
  • you release a new version
  • I click

it will get a 404. Obviously BFCache extends the effective lifetime of pages, so it makes this problem worse but the real cause seems to be the lack of versioning and immediately deleting all of the old resources when you deploy changes.

@Maxim-Mazurok

> > This would create a problem with versioning if the page has references to dynamically loaded content. For example, with AngularJS it might have conditional ng-include="my-template.html". So if the user clicks on something - it will attempt to load that HTML from the server. And if we release a new version where that template is removed - it will result in 404.
>
> Don't you have this problem already without BFCache? If:
>
>   • I load the page
>   • you release a new version
>   • I click
>
> it will get a 404. Obviously BFCache extends the effective lifetime of pages, so it makes this problem worse but the real cause seems to be the lack of versioning and immediately deleting all of the old resources when you deploy changes.

True, maybe it wasn't the best example. Honestly, I lost track of all possible combinations and cases. All I want is for the browser to play by the rules, so that if I say no-cache, it will always revalidate with the server. BFCache, as well as regular Back/Forward/Restart, does not fully respect my no-cache header. They only respect no-store for the entry HTML page, but all the CSS/JS resources on that page will be retrieved from cache without ever hitting my server to revalidate. That is, unless I load these resources dynamically; maybe I also have to add a query string, I don't remember now. Either way, it's hard enough to keep in mind all the possible cases and ways that resources can be loaded, and then on top of that I need to think about all the different aggressive caching strategies that Chrome uses, with little to no documentation on how to avoid them. My bank http://anz.com.au/ has disabled the "back" button, which I found very frustrating as a user, but now, being in the developer's shoes, I understand where they're coming from. They probably just don't want to deal with defects and issues caused by the old version, just like me.
(I might've mixed up no-cache/no-store, but hopefully you get the idea).
We had a bit of a brainstorm with the team, and it seems like the best way to ensure that the user is on the latest version is to attach version info from the FE to every request, probably via query string, and then handle the error that the server should produce for old versions. This also means creating some custom resource loader for CSS/JS/HTML templates. That is possible with the help of a Service Worker, but it's not always guaranteed to work, e.g. on the first visit.
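
A minimal sketch of that idea, assuming the version travels in a query parameter. The parameter name `v`, the helper names and the 409 status are arbitrary choices made for illustration, not something from this thread or from any framework.

```typescript
// Hypothetical parameter name carrying the front-end build version.
const VERSION_PARAM = "v";

// Client side: tag an outgoing resource URL with the FE version.
function withVersion(url: string, version: string): string {
  const u = new URL(url);
  u.searchParams.set(VERSION_PARAM, version);
  return u.toString();
}

interface VersionCheckResult {
  status: number;
  body: string;
}

// Server side: reject requests coming from an outdated (or untagged) client.
// The 409 status is an arbitrary pick for this sketch.
function checkVersion(requestUrl: string, currentVersion: string): VersionCheckResult {
  const sent = new URL(requestUrl).searchParams.get(VERSION_PARAM);
  if (sent !== currentVersion) {
    return { status: 409, body: "stale client, reload required" };
  }
  return { status: 200, body: "ok" };
}

const tagged = withVersion("https://app.example/templates/my-template.html", "1.4.2");
console.log(tagged); // https://app.example/templates/my-template.html?v=1.4.2
console.log(checkVersion(tagged, "1.4.2").status); // 200
console.log(checkVersion(tagged, "1.5.0").status); // 409
```

A page restored from BFCache after a deploy would then get an explicit "stale client" response on its next request instead of an unexplained 404, and the front end can react by reloading.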
Anyway, this issue is probably not the place to discuss proper versioning in the web, though I would appreciate any pointers and tips (send to maxim@mazurok.com please).

The summary is:

  • make web more predictable by respecting developer decisions (respect no-cache/no-store)
  • if you really believe that the majority of devs are wrong, and you don't want to respect no-store - give an ability for devs to affirm that they actually want to use no-store, maybe create no-store-even-for-bfcache or something, to give us more control when we need it
