Network Information API reboot #569
I agree it would be nice to have "sustained speed" but actually measuring this is very difficult in any real network, for several reasons:
It's for these reasons (and others) that real-world protocols like TCP and QUIC constantly adapt to the apparently available capacity by measuring packet loss and latency. That's a superior approach to having the client just claim some poorly defined value. Tagging @ianswett and @DavidSchinazi for awareness.
Thanks for the reply, @ekr. Note that the objective is not to provide an accurate “speed test” (sites can do that themselves if they absolutely need to), but rather to provide a rough ballpark figure for the recently observed speed. To make this clearer, one of the use cases is to replace background videos with poster images. Should the API tell the developer that the connection recently allowed video (for example, by being in, say, the 25 Mbit/s bucket), the site could show a background video and not just a poster image. The video codec would then take care of adaptive streaming based on the speed actually observed in real time.
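For concreteness, here is a minimal sketch of that use case written against the current API surface (`navigator.connection`); the threshold, element selector, and asset paths are illustrative only and not taken from the proposal:

```js
// Decide between a background video and a poster image based on the
// currently reported connection quality. Falls back to the poster when the
// API is unavailable (Gecko/WebKit) or Save-Data is requested.
function canAffordBackgroundVideo() {
  const connection = navigator.connection;
  if (!connection || connection.saveData) return false;
  // Illustrative threshold: "4g" bucket and at least 5 Mbit/s reported downlink.
  return connection.effectiveType === '4g' && connection.downlink >= 5;
}

const hero = document.querySelector('.hero'); // hypothetical element
if (canAffordBackgroundVideo()) {
  const video = document.createElement('video');
  video.src = '/media/background.mp4'; // hypothetical asset
  video.muted = true;
  video.loop = true;
  video.autoplay = true;
  hero.append(video);
} else {
  hero.style.backgroundImage = "url('/media/poster.jpg')"; // hypothetical asset
}
```

The adaptive streaming of the video itself would then still be handled by the player, as described above.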
Yes, but for the reasons I indicate above, this is simply not going to be accurate. If you look at the example I provided, which actually is a speed test, there is nearly an order of magnitude difference depending only on the size of the file (due to slow start, presumably).
Yes, I understand why you might want this functionality, but that doesn't make it any more technically feasible. In particular, this is very likely to chronically underestimate (because most clients don't use the entire channel all the time) and therefore will cause the server to provide a less rich experience than it otherwise could. It's quite likely that the server would be better off just directly measuring the download time for its own content (especially if it's using QUIC and can introspect into the connection state).
In my experience, systems built on "did the network recently have property X?" never work well. They always underperform compared to a system that tries and measures. Based on this, I'm not sure this new API adds value. When you add in the privacy implications, the value might become negative.
Thanks for the additional replies, @ekr and @DavidSchinazi! One way to read my proposal would be that it moves the closed-ended, label-based […]

Judging from the ChromeStatus usage numbers that show that the current API is encountered on >40%(!) of all page loads, the way […]

Regarding the privacy implications, the proposal actually reduces the fingerprintable surface compared to the current API. Note that the specifically negatively highlighted […]
I think we're starting to repeat ourselves here.
In any case, I'm skeptical that this provides accurate data for the reasons I indicated. If you believe it does, then I think it's your responsibility to demonstrate that via some measurements. Some possibilities here would be:
There is a fingerprinting vector to the current API for sure, although there are cross-origin tracking mitigations in place. When it comes to non-tracking, non-analytics use cases (apart from what's listed in the comment above), the popular open-source project Shaka Player uses the API to adjust the initial playback rate, and developers of the social networking site Facebook have gone on the record to state that this API is how they realize their adaptive loading use case.
@tarunban has offered to look into what data we could provide. We might be able to provide UMA data on the RTT distribution on WiFi networks, for example.
(Just to add: the analytics use case and the adaptive loading use case go hand in hand, as outlined in this brilliant post by the search company Algolia. I just verified that the described logic is actually in use on hn.algolia.com in supporting browsers.)
Thanks for agreeing to try to get some data. We'll await that.
I think the data I wanted to share was mostly around the variability that we observe from Chrome in the quality of different networks, even though all of them have WiFi as the last hop. We measured the TCP RTT (the time taken to successfully establish a TCP connection to an individual endpoint) across users to different endpoints; here is the percentile distribution in milliseconds: […]

This data is fairly intuitive, but it's meant to show that the type of the network's last hop is not sufficient to determine the quality of the network.
I don't think this tells us much. I certainly agree that the last hop is not sufficient to determine the quality of the network, but I don't think that addresses the point that this API is likely to give highly unreliable information. Incidentally, RTT doesn't necessarily tell you very much about available bandwidth, especially if you are dealing with users on high bandwidth-delay product networks (as can occur in, for instance, Australia). The most natural experiment here is to measure:
And then look at the correlation between them.
Thanks, @ekr. @tarunban, do you think we can provide anything along these lines, maybe based on a variant of the network quality estimator that is not capped at 4G but open-ended?
In theory, I agree with this. But in practice, they're quite well correlated.
I should have mentioned one more thing: is this the distribution of individual measurements, or the distribution of means, or something else? Because you'd expect quite a bit of spread in individual measurements even from a single device, due to the paths to different locations, occasional packet loss, etc.
I think it's a bit challenging to run a direct speed test in the browser because it generally uses too much data (problematic on metered connections) and also, by definition, saturates the network. This means running the speed test will likely slow down the tasks the user is trying to accomplish. Instead of a bandwidth speed test, we could run an RTT test, which uses less data and does not saturate the network. That's feasible, but it's already very close to what the Chromium open-ended network quality estimator implementation does: it observes the RTT to different endpoints, takes a weighted average (with higher weight given to more recent samples), and returns that weighted average value.
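As a rough sketch of that weighted-average idea (the half-life constant below is made up and is not the value Chromium actually uses):

```js
// Exponentially weight RTT samples so that more recent observations dominate
// the estimate; older samples decay with a configurable half-life.
class RttEstimator {
  constructor(halfLifeMs = 60_000) {
    this.halfLifeMs = halfLifeMs;
    this.samples = []; // { rttMs, timestampMs }
  }

  addSample(rttMs, timestampMs = Date.now()) {
    this.samples.push({ rttMs, timestampMs });
  }

  estimate(nowMs = Date.now()) {
    let weightedSum = 0;
    let totalWeight = 0;
    for (const { rttMs, timestampMs } of this.samples) {
      // Weight halves for every halfLifeMs of sample age.
      const weight = Math.pow(0.5, (nowMs - timestampMs) / this.halfLifeMs);
      weightedSum += rttMs * weight;
      totalWeight += weight;
    }
    return totalWeight > 0 ? weightedSum / totalWeight : undefined;
  }
}
```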
I'm not suggesting that you do this generally. I'm suggesting that you run an experiment which measures both maximum attainable speed and the metric you are proposing (maximum recent consumed bandwidth) and demonstrate that they are correlated. This could be run on a relatively small fraction of the users a single time and then you'd have the data.
As I observed previously, RTT and bandwidth are different quantities (though, as Ian suggests, they are in practice not unrelated). However, to the extent that you think RTT is a proxy for effective path bandwidth, that suggests this API is unnecessary: the server can measure the RTT on the connection directly by measuring connection establishment time (this is a bit of a pain in TCP but can be done by modifying the kernel, and is straightforward in QUIC) and use that to estimate bandwidth, without any information from the client at all. This measurement of course suffers from some initial noise, but has several advantages: (1) it measures the performance of this path, not of random other paths the client may be using, and (2) it measures the current conditions, rather than past conditions.

Taking a step back: the question at hand here is whether the client is in possession of better information about its network environment than the server can measure directly. What I'm asking you to do is to provide a set of measurements that indicate that that's the case.
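The analysis step of the experiment described above essentially boils down to showing how strongly the two per-client measurements move together. A minimal sketch of that step, assuming the measurement pairs have already been collected somehow (data collection itself is out of scope here):

```js
// Pearson correlation between paired measurements, e.g. per-client
// maximum attainable speed vs. the proposed "recently consumed bandwidth".
// Returns a value in [-1, 1]; values near 1 would support the proposal's premise.
function pearsonCorrelation(xs, ys) {
  const n = xs.length;
  if (n === 0 || n !== ys.length) throw new Error('need equal-length, non-empty samples');
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / n;
  const meanX = mean(xs);
  const meanY = mean(ys);
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}
```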
I've seen papers which demonstrate this correlation, but I can't seem to find any right now. For Reno/Cubic/etc.-style TCP, the relationship is a direct result of the congestion controller (a.k.a. the Mathis equation); for example, see https://netbeez.net/blog/packet-loss-round-trip-time-tcp/. I'm not sure anyone has done a study of this for BBR, but if we're primarily interested in server-to-client bandwidth, then that should be an analysis I can do using existing server data. Client-to-server data is also possible, but we don't have nearly as many data points.

One caveat for TCP is that if there's a PEP in the way, the measured RTT could be very small (i.e., a few ms) while the actual RTT is very large (e.g., satellite). I might be able to break the data down by a few different RTTs (i.e., SynAck RTT, MinRTT, SRTT) if they're available. For QUIC, MinRTT and SRTT should be sufficient.
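For reference, a rough sketch of the Mathis et al. relation mentioned above, which bounds steady-state Reno/Cubic throughput in terms of MSS, RTT, and loss probability; the example numbers are illustrative only:

```js
// Mathis bound: throughput ≤ (MSS / RTT) * (C / sqrt(p)), with C ≈ 1.22
// for periodic loss. Returns bits per second.
function mathisThroughputBps(mssBytes, rttSeconds, lossProbability, constant = 1.22) {
  if (lossProbability <= 0) return Infinity; // no observed loss: the bound does not apply
  return ((mssBytes * 8) / rttSeconds) * (constant / Math.sqrt(lossProbability));
}

// Example: 1460-byte MSS, 50 ms RTT, 0.1% loss → roughly 9 Mbit/s.
console.log(mathisThroughputBps(1460, 0.05, 0.001) / 1e6);
```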
In the interest of moving forward with this issue, I will add a few thoughts:
That's good to hear!
It's meant to be useful enough to tell you what speed the browser has sustainably observed over the most recent of a number of sliding windows. Think of it as similar to OS widgets that show you the overall network speed. In the user research linked in the proposal, this was considered useful.
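As a rough sketch of what "sustained speed over sliding windows" could look like (the window length, count, and reporting below are made up here, not taken from the proposal):

```js
// Record bytes received per fixed window and report the best recent window
// as the observed speed.
class ObservedSpeed {
  constructor(windowMs = 10_000, windowCount = 6) {
    this.windowMs = windowMs;
    this.windowCount = windowCount;
    this.windows = new Map(); // window start timestamp -> bytes received
  }

  recordBytes(byteCount, nowMs = Date.now()) {
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    this.windows.set(windowStart, (this.windows.get(windowStart) ?? 0) + byteCount);
    // Drop windows that have aged out.
    const oldest = windowStart - this.windowCount * this.windowMs;
    for (const start of this.windows.keys()) {
      if (start < oldest) this.windows.delete(start);
    }
  }

  // Best observed speed across the retained windows, in Mbit/s.
  maxObservedMbps() {
    let maxBytes = 0;
    for (const bytes of this.windows.values()) maxBytes = Math.max(maxBytes, bytes);
    return (maxBytes * 8) / (this.windowMs / 1000) / 1_000_000;
  }
}
```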
That's a good point. I was mostly thinking of reporting the downlink speed, since this is what matters more in the majority of cases.
This was added to the spec so browser vendors could still correctly implement it, but at the same time not expose the actual data.
These were some of the use cases in mind indeed.
On the other hand, this removes the information from the old API, so overall there'd be less data.
This sounds perfectly reasonable to me.
This would help with the very first request, so the server can tailor the experience from the start. This is especially useful for high-traffic sites.
The main idea is that this bit would change frequently enough. Just as an example: on a plane you'd be metered because you only got the 100 MB WiFi pack; when you enter the terminal you'd be unmetered because you're connected to the airport WiFi; and on the train into town you'd be metered again because you're roaming. The setting doesn't really correlate with a trackable location.
Gecko and WebKit don't implement the old API, so this would still be problematic 🙂
It is my understanding that the server first has to respond with […]
Hah, that's fair 👍.
The secret is […]
Hey Mozilla folks,
I have rebooted the Network Information API recently. This is all in a relatively early stage, but I thought now would be a good time to get your feedback on the proposal:
Here is the short version:
Each of the attributes is accompanied by a client hint header that reflects the attribute:
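(The concrete hint names are defined in the linked proposal. Purely to illustrate the client-hints mechanism itself, here is the shape of the exchange using the hint names that already ship for the current API, namely `Downlink`, `RTT`, and `ECT`: the server opts in on a response, and subsequent requests carry the hints.)

```http
HTTP/1.1 200 OK
Accept-CH: Downlink, RTT, ECT

GET /video-or-poster HTTP/1.1
Downlink: 9.5
RTT: 50
ECT: 4g
```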
Thanks in advance for your thoughts, here or in the motivational document.
Cheers,
Tom