Skip to content
This repository has been archived by the owner on Dec 12, 2020. It is now read-only.

Update "should site-key anyway" checks to track the BCG #15

Merged
merged 3 commits into from
May 6, 2020

Conversation

domenic
Copy link
Collaborator

@domenic domenic commented Feb 26, 2020

Closes #8.

/cc @annevk @wjmaclean

(I'll probably move this spec text into https://wicg.github.io/origin-policy/ soon, but for now it's here.)

I decided to check the session history entry's Document's origin, if the Document exists, instead of checking the session history entry's URL. As the "navigating a subframe" scenario illustrates, this allows better isolation when bfcache eviction happens. I'm open to being convinced this was the wrong tradeoff, but I thought since there were already some racy scenarios, it probably was OK.

@annevk
Copy link

annevk commented Mar 2, 2020

Given that Chrome tries to aggressively replace top-level browsing contexts upon navigation (whenever there are no popups, I'm told), which will also result in new browsing context groups, is it really that bad to track origins for that browsing context group in a deterministic manner?

@domenic
Copy link
Collaborator Author

domenic commented Mar 2, 2020

In what way would that be observably different from this one? Is it only the bfcache issue? If so we can consider switching to URL instead of Document.

@annevk
Copy link

annevk commented Mar 3, 2020

So I think this one defines a cache across different browsing context groups, which I'm not sure is what we need?

In that

  • Each tab is a browsing session
  • A browsing session holds history and a top-level browsing context
  • A top-level browsing context belongs to a browsing context group
  • Mostly when navigating the top-level browsing context gets replaced, but history does not

@domenic domenic changed the title Update "should site-key anyway" checks to use session history Update "should site-key anyway" checks to track the BCG May 1, 2020
@domenic
Copy link
Collaborator Author

domenic commented May 1, 2020

Alright, I've updated this to use a per-BCG list. The spec details are a little fuzzy, but that's just because I want to work on a proper HTML Standard pull request next week or so. I'd love review of the overall concept, and especially the examples.

@domenic domenic mentioned this pull request May 1, 2020
15 tasks
README.md Outdated

Both of these are consequences of a desire to ensure that same-origin sites do not end up isolated from each other.
Both of these are consequences of a desire to ensure that same-origin sites do not end up isolated from each other, even in scenarios involving navigating through the session history.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems tricky as this can involve multiple BCGs. whatwg/html#5350 will be relevant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence was meant to be more of a non-normative summary, and not a strict definition, but I can see how it's confusing in that way.

scenarios.md Outdated
* Subframe 1: `https://e.org` (with `https://x.e.com` in the session history)
* Subframe 2: `https://x.e.com` w/ OI

In this case, subframe 2 ends up keyed by `Site{https://e.com}`. This ensures that if the user navigates subframe 1 back, restoring the `Site{https://e.com}`-keyed instance of `https://x.e.com` in subframe 1, that subframe 1 and subframe 2 are still in the same `Site{https://e.com}` agent cluster, i.e. we have avoided isolating same-origin pages from each other.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's no bfcache for subframes, perhaps this isn't needed? In part this logic kinda depends on a solid understanding and formalization of bfcache and BCG replacement, but I might be overlooking something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is no bfcache, but there is the HTTP cache.

Consider if a user clicked the back button in such a state, where subframe 2 was in the Origin{https://e.com} agent cluster. To avoid putting same-origin pages into separate agent clusters, we'd then need to put subframe 1's copy of https://x.e.com into the Origin{https://e.com} agent cluster---despite the headers for that load not requesting origin isolation.

Maybe that's OK, you might think. After all, sometimes we will origin-isolate things when their headers don't request it. (For example, in some of the simpler three-document scenarios above you can see this.) But, we do that because there was a previously-seen origin-isolated instance. Doing it in this case would be weird; we're effectively taking our cues from the second time we saw https://x.e.com, not the first.

So the intuition here is basically that the first time we see an origin in a BCG, that should govern things going forward.

scenarios.md Outdated

Will the subframe get site-keyed, or origin-keyed?

The answer is origin-keyed. The `window.savedFrame` variable does mean that the agent cluster map still contains an entry with key `Site{https://e.org}`, which itself contains the saved realm and corresponding `Window` object. However, because the iframe was removed from the DOM, the `Window` has no browsing context, and thus the browsing context group does not contain any browsing contexts that have seen the `https://e.org` origin. Thus the request for origin isolation is respected, and we end up with `Origin{https://example.org/}` as the key.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if the parent keeps the agent alive wouldn't that keep the agent cluster alive and so if you go find your agent cluster why would it create a new one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you're right... Although I'm not sure if we're using the same reasoning.

I now think basically each BCG should have an append-only list of historical agent cluster keys. So, removing a BC should not change this list.

I'll reformulate the PR in these terms...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's what I proposed originally and you all didn't like that 😊

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry I wasn't clear; I meant the last few commits to align on your proposed model, since you kept poking holes in ours :)

@domenic domenic merged commit f2bbb4f into master May 6, 2020
@domenic domenic deleted the session-history-based branch May 6, 2020 18:47
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Current algorithm might be GC dependent
2 participants