Broken foreground detection #752

Closed
jan-ivar opened this issue Nov 22, 2020 · 10 comments · Fixed by #912

@jan-ivar
Member

This spec references "focus" in 8 places, e.g.: "The User Agent MUST wait to proceed to the next step until the relevant settings object's responsible document is fully active and has focus."

All are meant to ensure camera & microphone cannot be turned on from background tabs, but it doesn't work:

Other specs are in the same boat.

We need to fix whatwg/html#5049 and use the following algorithm instead of "and has focus":

@jan-ivar jan-ivar self-assigned this Nov 22, 2020
@jan-ivar
Member Author

jan-ivar commented Dec 9, 2020

This may not cut it either, as 3 out of 4 browsers return false while the user is in the URL bar, which shouldn't delay camera IMHO.

Worse, only 1 (Firefox) out of 4 browsers appears to care about focus at all: https://jan-ivar.github.io/dummy/gum_visiblefocus.html

Several specs seem in need of a similar "visible and focused" step in HTML, but it may need to be a new one.
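
A minimal TypeScript sketch of what such a "visible and focused" step could look like, contrasting a strict document-focus reading with a relaxed one where focus in the browser's own UI (such as the URL bar) does not delay camera access. The `TabState` shape, the `FocusOwner` values, and both function names are hypothetical; they are not taken from HTML or from any implementation.

```typescript
// Hypothetical model of the state a user agent might consult.
// None of these names come from the HTML spec or a browser codebase.
type FocusOwner = "document" | "browserChrome" | "otherWindow";

interface TabState {
  fullyActive: boolean;   // the document is fully active
  foregroundTab: boolean; // the tab is the selected tab in its window
  focusOwner: FocusOwner; // where keyboard input currently goes
}

// Strict reading: the document itself must hold focus.
function strictVisibleAndFocused(s: TabState): boolean {
  return s.fullyActive && s.foregroundTab && s.focusOwner === "document";
}

// Relaxed reading: focus in browser UI (e.g. the URL bar) also counts,
// as long as the tab is fully active and in the foreground.
function relaxedVisibleAndFocused(s: TabState): boolean {
  return s.fullyActive && s.foregroundTab && s.focusOwner !== "otherWindow";
}

// Example state: the user is typing in the URL bar of the foreground tab.
const inUrlBar: TabState = {
  fullyActive: true,
  foregroundTab: true,
  focusOwner: "browserChrome",
};
```

With the user in the URL bar, `strictVisibleAndFocused(inUrlBar)` is false while `relaxedVisibleAndFocused(inUrlBar)` is true, which is the difference at stake here.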

@jan-ivar
Member Author

It's probably a stretch to call this editorial, since behavior varies in implementations.

@jan-ivar
Member Author

jan-ivar commented Sep 17, 2021

@eladalon1983 wrote in w3c/mediacapture-screen-share#192 something I think is relevant to getUserMedia:

> If the tab is visible but unfocused (for example, two browser windows visible on the screen side by side), this would produce the difference of not invoking a prompt on the unfocused browser window+tab (until focused).
> ... Are we sure this is really preferable?

It's overly strict in that particular case, which comes up more for getUserMedia than for getDisplayMedia, which requires user activation (and thus focus).

This spec mandates (keyboard) focus ahead of prompting, when it might suffice that the requesting document's tab is the foreground tab in that window.

When I tested this, Safari appeared to have a good solution that technically violates the spec: it prompts if the requesting document's tab is the foreground tab, regardless of focus. Its prompt is also clearly associated with the document.

The spec should probably allow this. This suggests two tests: A "foreground" visibility test ahead of prompting, and a "foreground" + focused test before resolving, to preserve the no-prompt case.
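
The two-test idea can be sketched roughly as follows. This is an illustrative TypeScript model only: the `PageState` fields, the `Decision` values, and the function names are hypothetical, and a real algorithm would live in spec prose and browser internals.

```typescript
// Hypothetical inputs; the names are for illustration only.
interface PageState {
  foreground: boolean; // requesting document's tab is the foreground tab
  focused: boolean;    // the stricter focus condition also holds
}

type Decision = "wait" | "prompt" | "resolve";

// Test 1: a "foreground" visibility test ahead of prompting.
function mayPrompt(s: PageState): boolean {
  return s.foreground;
}

// Test 2: a "foreground" + focused test before resolving.
function mayResolve(s: PageState): boolean {
  return s.foreground && s.focused;
}

// permissionGranted models the no-prompt case (or a prompt already accepted).
function nextStep(s: PageState, permissionGranted: boolean): Decision {
  if (permissionGranted) {
    return mayResolve(s) ? "resolve" : "wait";
  }
  return mayPrompt(s) ? "prompt" : "wait";
}
```

For two browser windows side by side, a visible but unfocused page (`foreground: true, focused: false`) gets `"prompt"` when no permission exists yet, but `"wait"` when permission was already granted, so the no-prompt case still requires focus.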

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Oct 12, 2021
…romises r=jib,nika

w3c/mediacapture-main#574

Focus on browser chrome widgets is accepted provided the tab is fully active
and foreground.
w3c/mediacapture-main#752 (comment)

Differential Revision: https://phabricator.services.mozilla.com/D127051
hsivonen pushed a commit to hsivonen/gecko that referenced this issue Oct 12, 2021
…romises r=jib,nika

w3c/mediacapture-main#574

Focus on browser chrome widgets is accepted provided the tab is fully active
and foreground.
w3c/mediacapture-main#752 (comment)

Differential Revision: https://phabricator.services.mozilla.com/D127051
@youennf
Contributor

youennf commented Jan 10, 2022

  • We never intended to require iframe focus (we want the top-level browsing context instead)

I guess this is the same for screen sharing. I filed w3c/mediacapture-screen-share#203

@alvestrand
Contributor

Pinging @palak8669

@jan-ivar
Member Author

jan-ivar commented Oct 27, 2022

> This suggests two tests: A "foreground" visibility test ahead of prompting, and a "foreground" + focused test before resolving, to preserve the no-prompt case.

I think this makes sense for getUserMedia. On desktop, it seems comforting to know that other browser windows that weren't using camera or microphone when I've left them open cannot decide to turn on camera or microphone on a whim while I'm not interacting with them.

But what would this mean for enumerateDevices? Right now, Firefox's check in enumerateDevices is:

    if (!bc->IsActive() ||  // background tab or browser window fully obscured
        !bc->GetIsActiveBrowserWindow()) {  // browser window without focus

IOW, the same check: page visibility AND focus of the user agent window (not the document).

While a focus requirement seems defensible for getUserMedia, perhaps the visibility requirement alone is sufficient for enumerateDevices? There it's anti-fingerprint, not anti-spying. @karlt @youennf @martinthomson @jesup Thoughts?
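
For comparison, the two candidate conditions can be written side by side. This TypeScript sketch only mirrors the shape of the Firefox check quoted above; `BrowsingContextState` and both function names are hypothetical, and the two fields merely stand in for `IsActive()` and `GetIsActiveBrowserWindow()`.

```typescript
// Hypothetical stand-in for the two values the quoted check reads.
interface BrowsingContextState {
  isActive: boolean;              // false for a background tab or fully obscured window
  isActiveBrowserWindow: boolean; // false when the browser window lacks focus
}

// Same shape as the quoted check: proceed only if visible AND window-focused.
function passesVisibilityAndFocus(bc: BrowsingContextState): boolean {
  return bc.isActive && bc.isActiveBrowserWindow;
}

// The alternative floated for enumerateDevices: visibility alone.
function passesVisibilityOnly(bc: BrowsingContextState): boolean {
  return bc.isActive;
}

// Example: two windows side by side; this one is visible but lacks window focus.
const unfocusedButVisible: BrowsingContextState = {
  isActive: true,
  isActiveBrowserWindow: false,
};
```

Here `passesVisibilityAndFocus(unfocusedButVisible)` is false and `passesVisibilityOnly(unfocusedButVisible)` is true, so the choice only matters for pages that are visible in a window without focus.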

@karlt
Contributor

karlt commented Oct 31, 2022

What an operating-system window focus requirement provides for enumerateDevices() is that there would (usually) be only one focused browsing context hierarchy on a system.
With only a visibility test, there would often be more than one "visible" top-level browsing context on a user's desktop, allowing fingerprinting across origins. Visibility is typically not strict, and so a browsing context is typically considered visible even when fully occluded by another system window, and some desktop systems do not promote minimization of inactive windows.

The disadvantage of the focus requirement is that sometimes the presence of a device is useful for displaying items that would be visible before any user interaction.

Visibility seems the preferred requirement for enumerateDevices() and "devicechange" if its fingerprinting exposure can be made comparable to that of focus.
For example, if delaying the exposure of device changes by returning an old list of devices for a long enough unpredictable period of time would reduce the correlation between origins sufficiently, then the list of devices would at least be available and accurate when the devices haven't changed recently.
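
The delayed-exposure mechanism could look roughly like the TypeScript sketch below. Everything in it is an assumption made for illustration: the `DelayedDeviceList` class, the delay bounds, and the injected clock and random source are hypothetical. Whether such a delay would reduce cross-origin correlation enough remains an open question; the sketch only shows the mechanism.

```typescript
// Hypothetical sketch: a change to the device list becomes visible only
// after a random delay; until then the old list keeps being returned.
class DelayedDeviceList {
  private exposed: string[];
  private pending: string[] | null = null;
  private revealAt = 0;
  private readonly minDelayMs: number;
  private readonly maxDelayMs: number;
  private readonly random: () => number;

  constructor(
    initial: string[],
    minDelayMs: number,
    maxDelayMs: number,
    random: () => number = Math.random,
  ) {
    this.exposed = initial.slice();
    this.minDelayMs = minDelayMs;
    this.maxDelayMs = maxDelayMs;
    this.random = random;
  }

  // Called when the real device list changes at time nowMs.
  update(devices: string[], nowMs: number): void {
    // Pick a deadline only if no change is already waiting, so that
    // repeated changes cannot postpone exposure indefinitely.
    if (this.pending === null) {
      const span = this.maxDelayMs - this.minDelayMs;
      this.revealAt = nowMs + this.minDelayMs + this.random() * span;
    }
    this.pending = devices.slice();
  }

  // What a page would see at time nowMs.
  current(nowMs: number): string[] {
    if (this.pending !== null && nowMs >= this.revealAt) {
      this.exposed = this.pending;
      this.pending = null;
    }
    return this.exposed.slice();
  }
}
```

When devices have not changed recently, `current()` returns the accurate list immediately, which matches the availability point made above; only fresh changes are held back.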

@martinthomson
Member

Does the platform already expose whether the current window has focus? I assume that it does, but want to confirm.

Other than that, I think Karl’s argument resonates with me. The fingerprinting risk exists if the information is released under any focus condition, so we are really looking at what makes the API useful.

@karlt
Contributor

karlt commented Oct 31, 2022

Operating system window focus is exposed through "focus" and "blur" events iff the user-agent is directing keyboard events to the navigable. If the user-agent is taking keyboard events for its own widgets, then these "focus" and "blur" events are not dispatched.

Gecko exposes window focus, even when the user-agent is directing keyboard events to its own widgets, through :-moz-window-inactive, but I'm not aware of any standardized APIs already doing this.

@jan-ivar
Member Author

jan-ivar commented Nov 3, 2022

> Does the platform already expose whether the current window has focus?

If you're in an iframe without focus, then there's no way to tell whether the current window has focus or not AFAIK.

> ...if fingerprinting exposure can be comparable to focus. ... For example, if delaying the exposure ...

Agreed. I see no time limit where the spec says: "User Agents MAY add fuzzing on the timing of events to avoid cross-origin activity correlation".
