Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Session history indexing is broken #2985

Closed
bzbarsky opened this issue Aug 30, 2017 · 14 comments
Closed

Session history indexing is broken #2985

bzbarsky opened this issue Aug 30, 2017 · 14 comments
Assignees

Comments

@bzbarsky
Copy link
Contributor

https://html.spec.whatwg.org/multipage/history.html#history defines things as following:

  • Each browsing context has a session history, consisting of session history entries.
  • There is a joint session history which combines the session history of all the "current" (contained in a fully active Document) browsing contexts in a browsing context tree.
  • Entries in the joint session history have indices that are chronological in terms of when they were added to their respective session histories.

One problem with this is that the spec doesn't define whether entry indices are static or dynamic. That is, does removing entries from the joint session history affect the indexing of entries that had larger indices?

As a concrete example, say we have a page X, with child browsing contexts A and B which load documents Y and Z respectively. Y has child browsing context C. C initially loads document W. The following steps happen:

  1. The page X is loaded.
  2. A navigation is performed in C to document W'
  3. A navigation performed in B to document Z'.
  4. A navigation is performed in A, to a document Y', which has no subframes.

After step 1, I believe the only entry in the joint session history is the one for the toplevel page (but am not sure about this; it's not clear what the "most recently added" thing is in this case). Its index is presumably 0.

After step 2, there are two entries in the joint session history: The one for W and the one for W'. Their indices are either [0, 1] (dynamic) or [1, 2] (static).

After step 3, there are three entries in the joint session history: W, Z, Z'. Their indices are either [0, 1, 2] (dynamic) or [1, 3, 4] (static).

After step 4, Y is no longer fully active, so the session history of C no longer participates in the joint session history. The joint session history now contains Z, Y', with indices either [0, 1] (dynamic) or [3, 5] (static).

So far so unclear, but now we have problems with both indexing versions. In the dynamic version, the entry corresponding to "right before step 3" has its index change from 1 to 0. Since .index is exposed, this seems pretty suboptimal. It certainly makes it impossible to actually use the exposed .index for anything sanely.

In the static version, history.back() is broken. That's because it tries to traverse the history by a delta of -1, and in step 1 of https://html.spec.whatwg.org/multipage/history.html#traverse-the-history-by-a-delta the index of the current entry is 5, 5-1 is 4, the number of items is 2, so the traverse steps are aborted.

@annevk
Copy link
Member

annevk commented Aug 31, 2017

I suspect this in part is a duplicate of #1454, but not marking it as such yet since this contains a lot of information too.

@bzbarsky
Copy link
Contributor Author

It's more than that. It's a objection to exposing .index, which is relevant, because Google is trying to ship that right now. I don't see how .index can be sanely exposed or used.

@domenic
Copy link
Member

domenic commented Aug 31, 2017

Sure, but the correct fix is to fix the spec per #1454, from what I understand.

@bzbarsky
Copy link
Contributor Author

bzbarsky commented Aug 31, 2017

Does that solve the indexing problem, though? Navigating (or worse, removing) subframes can still remove chunks from the session history, even with the #1454 approach. That means either gaps in the indices or reindexing.

In practice, Gecko reindexes, which makes it impossible to implement a sane .index. I can't speak for other implementations.

(There are other things that can lead to removals of stuff from session history, starting with expiration of too-old session history entries. Right now I think the spec requires that they all be kept, but browsers tend to ignore "leak all the memory" spec requirements like that in practice, and I wish specs would stop having them.)

domenic added a commit that referenced this issue Feb 7, 2018
This reverts f8e293f. Although there
was multi-implementer interest at the time, implementations have not
materialized, and Gecko has objected in #2985 that they are not able to
implement in their architecture.
@domenic
Copy link
Member

domenic commented Feb 7, 2018

I've posted #3460 for removing history.index from the spec for now.

domenic added a commit that referenced this issue Feb 16, 2018
This reverts f8e293f. Although there
was multi-implementer interest at the time, implementations have not
materialized, and Gecko has objected in #2985 that they are not able to
implement in their architecture.
@annevk
Copy link
Member

annevk commented Nov 23, 2018

As that's merged, closing this.

@annevk annevk closed this as completed Nov 23, 2018
alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
This reverts f8e293f. Although there
was multi-implementer interest at the time, implementations have not
materialized, and Gecko has objected in whatwg#2985 that they are not able to
implement in their architecture.
@jakearchibald
Copy link
Contributor

@bzbarsky just curious:

In practice, Gecko reindexes, which makes it impossible to implement a sane .index.

Why's it ok to update history.length when joint history changes, but bad to update something like history.index? Or is the problem in the implementation rather than the developer-facing API?

@annevk
Copy link
Member

annevk commented Jul 21, 2020

history.length is also problematic: #2018. We have considered if we could make it expose a lot less information, but haven't done the work toward that yet.

@bzbarsky
Copy link
Contributor Author

@jakearchibald if history.index can just change arbitrarily, including during the execution of a script (due to subframe removal), how would a consumer use it in practice? What would it get used for?

@jakearchibald
Copy link
Contributor

@bzbarsky sure, but you can say the same for history.length right?

@annevk
Copy link
Member

annevk commented Jul 22, 2020

Yeah, and again, I think we should make that less reliable. It's not like the history API has won any design awards. It's something that at some point got exposed and sites rely on to some extent, but it's also not fully interoperable or necessarily sensible given all the cross-origin leaks.

@jakearchibald
Copy link
Contributor

Cool. It just seemed from this thread that history.length was somehow less of an issue, but in fact it's just as much of an issue, but we can't remove it.

@bzbarsky
Copy link
Contributor Author

Exactly. In practice, the only uses we've seen for history.length are testing whether it's 1 or not (as in, is it possible to go back from this page), and even those are often pretty suspect...

@jakearchibald
Copy link
Contributor

Gotcha. Thanks for clarifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants