
There is still no complete replacement for LAN plaintext connections #23

Open
rcombs opened this issue Sep 7, 2020 · 69 comments

@rcombs

rcombs commented Sep 7, 2020

I've just been informed about this spec proposal, and I noticed that you're citing Plex as a mechanism for doing TLS on LAN servers that public sites communicate with.

I'm an engineer at Plex working on the media server, and it seems like you're missing a key caveat that our current implementation runs into, which in my opinion needs to be addressed before this feature can land in browsers in its current state; otherwise our use case will break for a number of users.

I've explained the problem in this Chromium issue; here's a quick summary:

CORS-RFC1918 isn't the first attempted solution to DNS rebinding attacks. A number of consumer and commercial routers have a feature (often on by default, sometimes non-user-configurable) that blocks DNS responses that give A records pointing to LAN addresses. This completely breaks our HTTPS setup, and forces us to downgrade to plaintext connections to raw IP addresses, which (thanks to mixed-content blocking) also forces us to load the web app itself over plaintext HTTP.
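For reference, the ranges these rebind-protection features match are the RFC 1918 private blocks. A minimal sketch of the check such a router applies to A records (function names are ours, purely illustrative):

```javascript
// RFC 1918 private blocks: the ranges that router "rebind protection"
// features typically match. Any A record resolving into one of these gets
// dropped, even when the name legitimately points at a LAN server.
const PRIVATE_BLOCKS = [
  { net: "10.0.0.0", bits: 8 },
  { net: "172.16.0.0", bits: 12 },
  { net: "192.168.0.0", bits: 16 },
];

function ipToInt(ip) {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True if `ip` falls inside any RFC 1918 block, i.e. an answer that a
// filtering router would discard before it reaches the client.
function isRfc1918(ip) {
  const addr = ipToInt(ip);
  return PRIVATE_BLOCKS.some(({ net, bits }) => {
    const mask = (~0 << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((ipToInt(net) & mask) >>> 0);
  });
}
```

A name like `192-168-1-5.<id>.plex.direct` resolves to `192.168.1.5`, which this predicate (and the router) flags as private, so the response never arrives.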

This situation is extremely regrettable as-is, but with the current CORS-RFC1918 proposal this case would break completely, meaning users in that situation wouldn't be able to access their LAN servers via the public web app at all. Specifically, this change would prevent any plaintext-HTTP loads of LAN resources from public origins.

This wouldn't be an issue for us if we didn't need to make plaintext requests in the first place, and I'd much prefer to get to a world where we no longer have to rather than coming up with some exception to allow them. As detailed in the linked Chromium issue, we handle this in most non-browser clients (all those on platforms where we have full control over the TLS and network stacks) by skipping DNS resolution entirely. That Chromium issue asks for a way to do the same in-browser. This or some other mechanism to allow TLS connections to LAN hosts with known IP addresses without public DNS is necessary for our product to continue working on affected networks if plaintext LAN requests from public hosts are blocked.

I've wondered whether this might be solvable using multicast DNS as well, though I'm not sure if that really provides any advantage over just letting JS short-circuit name resolution.

@mikewest
Member

mikewest commented Sep 8, 2020

Thanks for the feedback! (cc @letitz)

The current proposal really doesn't want to support unauthenticated connections to intranets from the internet. That's pretty explicitly something that the secure context requirement aims to break by making TLS a requirement for any server that wishes to expose itself outside of its network via the magic of mixed content.

The challenge you're faced with is indeed frustrating. DNS servers that block resolution to local IP addresses are news to me, and do complicate things for Plex. I'm not sure that the answer is giving you control over DNS resolution, however. There are a number of challenges there, both from a technical and policy perspective (consider an administrator using DNS to push folks to "safe mode" searches, a la safe.duckduckgo.com, and so on). The application layer doesn't seem like the right place to do resolution.

As I mentioned on the bug, DNS resolution doesn't actually seem like what you want. If you know the hostname, you know the IP address, and you really just want an authenticated connection to that address... Both RTC and my least-favourite part of Web Transport seem relevant here. It seems worthwhile to consider how we can make those communication mechanisms compatible with this proposal.

+@yutakahirano and @vasilvv might have ideas around doing that for Web Transport.
+@henbos might know the right folks to talk to for WebRTC.

@letitz
Collaborator

letitz commented Sep 8, 2020

It would be nice to have a well-lit path towards making HTTPS requests to localhost and private IP addresses that does not involve faking DNS results and/or self-signed certificates.

I wonder if regular fetches could be annotated with server certificate fingerprints like what Web Transport proposes (this is sure to please @mikewest 😉)? This might be very similar to telling the browser "connect to that IP, treat it as domain name foo".

Speaking of self-signed certificates, could the Plex server fall back to using one of those? I guess if you're not actually navigating to the Plex server, and only making subresource requests, that would not work.

@rcombs
Author

rcombs commented Sep 8, 2020

Re: @letitz:
We've never considered self-signed (or non-publicly-trusted-CA-signed) certs a serious option, either for subresource loads (where they don't work at all) or for loading the app (where they [correctly] present a massive privacy warning). Additionally, I assume you mean for connecting directly to an IP address? Even if we were able to load resources in that way, the TLS would be providing no authentication, which would defeat half the point of it.

Server fingerprint annotation would probably work, except during the window right after a cert has been renewed, before the client has retrieved a fingerprint for the new cert (assuming we're talking about certificate fingerprints and not public-key fingerprints).

Solutions that apply only to Fetch would work for most of our use-cases, but still leave out media element loads (which are a major component for us) unless the same extension was provided for those as well.

Re: @mikewest, looks like you've seen my comment here re: RTC/WebTransport; I'll move over to this GH thread for any further discussion.

You asked over on the Chromium tracker for a design doc; we don't have any publicly-available summary of our whole setup (not because any of it's secret, but because we just haven't really had a need to write it all up before). I'll go ahead and explain the whole case here:

  • Users run the Plex Media Server (PMS) software on their personal computers, NAS devices, home servers, etc.
  • PMS is a C++ application that runs an HTTP/HTTPS server
  • Users point PMS at media files on disk; PMS stores metadata about them
  • When a PMS instance is signed in to an account, it connects to our cloud service and requests a certificate from our cert issuance service
  • Plex's cloud service retrieves a certificate for that PMS instance from our CA partner (see this third-party write-up for details; it also notes the issue caused by overreaching DNS rebinding protection features)
    • The most important thing to note here is that the certs are for the domain *.[random unique-per-cert hex identifier string].plex.direct, and that clients connect to [dashed-quad-IP].[ID].plex.direct, which our plex.direct DNS service resolves to the corresponding IP address (IPv6 also works similarly).
  • Certs are renewed when they approach expiration
  • When possible (not behind double/CG NAT, router configured to allow it), PMS opens a port-forward on the router via NAT-PMP/UPnP to make itself visible on the public internet.
  • PMS sends its known LAN IP addresses (and, if applicable, router-mapped port) to Plex's cloud service, which keeps track of them (along with the observed WAN address)
  • Plex client apps retrieve information about the servers their user can access from Plex's cloud service (names, IP addresses, DNS names, TCP ports, auth tokens, etc.)
  • Client apps (Web, iOS, Android, Roku, macOS, Windows, etc) connect to PMS over HTTPS
    • They'll try to hit the server's LAN and WAN addresses in parallel, preferring the LAN address if connecting to it succeeds, and otherwise using the WAN address
  • Clients hit an XML/JSON API to browse the media library, and also download images (photo media, or posters and cover art), and stream audio/video content (which can be direct media element loads of static files in the format the user provides, or live-converted by the server to the client platform's preferred streaming protocol and codecs)
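The `*.plex.direct` scheme from the bullets above can be sketched as follows (the function name and `certId` value are illustrative; the dashed-quad convention is the one described in the sub-bullet):

```javascript
// Sketch of the *.plex.direct naming scheme: the IPv4 address is encoded
// as a dashed quad under a per-cert wildcard label, so the public DNS
// answer can be derived entirely from the name being queried, and the
// wildcard cert covers every address a server might have.
function lanHostname(ip, certId) {
  return `${ip.replace(/\./g, "-")}.${certId}.plex.direct`;
}

// lanHostname("192.168.1.5", "abc123") === "192-168-1-5.abc123.plex.direct"
```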

This deployment has served as an example for a few others like it. Western Digital now uses a similar approach, as does UnRAID (though I'm not sure if either connects directly to LAN addresses), to name a couple. Now that free Let's Encrypt certs are available (including wildcards), I think of this as a viable model for any application that has a central service that can exchange IDs (in the form of DNS names) and wants to allow secure communication between clients and servers on a LAN, with the exception of browser cases on networks with these hostile DNS responders, and loss-of-WAN scenarios (which we handle in other apps, but not in browsers).

The intersection of "no public address we can map ports on (or at least, no working hairpin NAT on the router)" and "router blocks DNS responses pointing at LAN" is surprisingly substantial and has been by far the largest headache in this deployment for years (though even if that case didn't exist, we'd have to implement most of the same workarounds in order to support loss-of-WAN cases).

@mikewest
Member

mikewest commented Sep 9, 2020

Thanks for the detail, @rcombs! I understand that the cert itself isn't really the issue, but the potentially hostile posture of the user's DNS provider. As I noted above, however, what's "hostile" from your perspective is potentially "reasonably defensive" from other perspectives. Administrators use DNS to manage their network, and it's not at all clear to me that allowing web origins the ability to bypass DNS resolution is a great idea.

Regarding the flow you laid out above: one thing I'm not clear on is the relationship between the client app and the server. The server lives inside someone's local network: where does the web-based client app live? I'm asking because there's currently (see #1, however) no restriction on private->private communication, nor public->public. Do users connect to a centralized cloud service in order to view media from their local network? Or do they host a local version of the client app on their local server?

> Solutions that apply only to Fetch would work for most of our use-cases, but still leave out media element loads (which are a major component for us) unless the same extension was provided for those as well.

I'm imagining here that it could be possible to take the stream resulting from something like Web Transport's datagram API, and feed it into the <video> element's srcObject for rendering. That would require someone to do some work to convert the ReadableStream into some kind of MediaStream (and I don't know those APIs nearly well-enough to know if that's possible from userland; I'll defer to someone like @henbos).

@rcombs
Author

rcombs commented Sep 9, 2020

> As I noted above, however, what's "hostile" from your perspective is potentially "reasonably defensive" from other perspectives.

Is this an issue if you're only allowed to override-resolve (or whatever you want to call it) something to a LAN address? I've only ever seen those blocked to protect against DNS rebinding, which this spec solves more cleanly. Theoretically these same cases can be bypassed by WebRTC or WebTransport (it's "just" massively more complicated), so it's hard to see how preventing regular HTTPS from working in those cases serves a network-security purpose.

> Where does the web-based client app live?

It can be in a few places, depending on the user setup. The app users open in browsers is primarily loaded from plex.tv, which is convenient and secure. Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).

The other case is the TV app, which can be loaded from plex.tv, but is usually shipped as an installed app bundle on smart TVs and set-top boxes; I think that means the client address space is "local" and the context is "secure", so this spec probably doesn't affect it?

> There's currently (see #1, however) no restriction on private->private communication, nor public->public.

Ah, the distinction between "private" and "local" is easy to trip over here. So that case would continue to work, but be insecure despite everything involved being technically capable of communicating securely, if only the client knew how to talk to the server.

So, there are some cases that would continue to work with these changes, but some that would break (most notably falling back on loading the browser app from plex.tv insecurely, which is an awful awful hack that I wish we didn't have to do), and several cases that are undesirable that would be great to have solutions for.

> I'm imagining here that it could be possible to take the stream resulting from something like Web Transport's datagram API, and feed it into the <video> element's srcObject for rendering.

Keep in mind that media files aren't simply streamed continuously; it's very common for the player to have to seek to read different parts of the file during normal playback (particularly during startup), and of course whenever the user seeks within the media. That'd be quite a bit of additional functionality on top of Web Transport (the bulk of an HTTP/3 stack, really).

@mikewest
Member

mikewest commented Sep 9, 2020

> Is this an issue if you're only allowed to override-resolve (or whatever you want to call it) something to a LAN address? I've only ever seen those blocked to protect against DNS rebinding, which this spec solves more cleanly.

The examples I provided are certainly external in nature (school admins protecting their users from inappropriate results, etc). I don't know enough about the enterprise case to understand whether similar DNS-based partitioning for internal services is a thing. I suspect it might be (and that @sleevi and @ericorth will know more about this world than I do).

> Theoretically these same cases can be bypassed by WebRTC or WebTransport (it's "just" massively more complicated), so it's hard to see how preventing regular HTTPS from working in those cases serves a network-security purpose.

  1. Bypassed iff both the client and server cooperate to establish a communication channel outside of DNS, yes. Presumably the services the administrator is pushing you away from wouldn't be incredibly enthusiastic about collaborating in unintended cases?

  2. Also, this is kinda why I don't like either RTC or WebTransport's layer-piercing attributes. There's quite a reasonable argument to be made that we shouldn't ship these capabilities in those APIs either. :) I mentioned that in https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/mHV_ZALf07Q/d7J9W0a1CQAJ.

> > Where does the web-based client app live?

> It can be in a few places, depending on the user setup. The app users open in browsers is primarily loaded from plex.tv, which is convenient and secure.

Got it, thanks. This would indeed be affected by the proposal we're discussing here, insofar as http://plex.tv/ would no longer be able to reach into local networks.

> Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).

The proposal we're discussing here would not block this use case, either in secure or non-secure modes. You'd be in the same boat as the status quo.

> The other case is the TV app, which can be loaded from plex.tv, but is usually shipped as an installed app bundle on smart TVs and set-top boxes; I think that means the client address space is "local" and the context is "secure", so this spec probably doesn't affect it?

Assuming that the set-top box is loading the app from itself (e.g. http://localhost/plex/app/ or similar), then it would be considered "local", and would be able to request "private" and "public" resources. http://localhost/ is also considered a "secure context".

> So, there are some cases that would continue to work with these changes, but some that would break (most notably falling back on loading the browser app from plex.tv insecurely, which is an awful awful hack that I wish we didn't have to do), and several cases that are undesirable that would be great to have solutions for.

I agree. I'd like to find reasonable solutions here that maintain the guarantees we want to provide to users, and reasonably isolate their local networks from the web.

@letitz
Collaborator

letitz commented Sep 9, 2020

> > Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).

> The proposal we're discussing here would not block this use case, either in secure or non-secure modes. You'd be in the same boat as the status quo.

Indeed, if the webapp is served from the media server on a private IP address, regardless of whether it is served securely or not, it can make requests to other servers with private IP addresses. Do note that if served insecurely, the webapp could not make requests to localhost - but it could make requests to the private IP corresponding to the UA host.

Suggestion: given the above, maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?

As Mike points out, we are considering extending the spec to forbid all cross-origin requests initiated by insecure contexts to private/local IP addresses: see #1. If we did this, then the above suggestion would not work.

This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins? For an on-path attacker to impersonate such an origin, they would need to have breached the local network and used something like ARP cache poisoning, a substantially taller order than intercepting a request to http://example.org. The target would still have to respond OK to pre-flight requests.
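A sketch of what "respond OK to pre-flight requests" might look like on the target server, using the header names that later appeared in the Private Network Access draft (`Access-Control-Request-Private-Network` / `Access-Control-Allow-Private-Network`); treat the exact names as provisional, and the helper itself as illustrative:

```javascript
// Build pre-flight response headers for a request arriving from a public
// origin. The allow-private-network header is the explicit opt-in: the
// device consents to being reached from outside its address space.
function preflightResponseHeaders(requestHeaders) {
  const headers = {
    "Access-Control-Allow-Origin": requestHeaders["origin"] || "*",
    "Access-Control-Allow-Methods": "GET",
  };
  if (requestHeaders["access-control-request-private-network"] === "true") {
    headers["Access-Control-Allow-Private-Network"] = "true";
  }
  return headers;
}
```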

As for generically solving the problem with authenticated connections to private network endpoints:

I just read through some of RFC 6762. I don't see how it would help the problem at hand, since IIUC CAs will not hand out certificates for .local domains? Seems to me the same fate would befall .home domains. Also did the .home TLD proposal progress any further than a draft?

@sleevi
Contributor

sleevi commented Sep 9, 2020

> This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins?

No, I don't think this is viable, and the least favorable of all options. The complexity calculation is inverted (it's easier, not harder), it's worse for users ("secure means secure, except when it doesn't"), and it further promulgates the notion of a security boundary that largely doesn't hold; that is, the local network is less secure, which is part of why this exists in the first place. This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.

To the problem at hand, as a Plex user and lover, I definitely want to make sure to understand the scenario. Having local devices block public DNS resolving to local names is, unfortunately, nothing new here, so I totally understand and appreciate the problem space. I also appreciate that you're not really a fan of solutions other vendors have used, such as using WebRTC (or the to-be-explored WebTransport), since like Mike, I'm also concerned about the security implications of that.

What's not obvious to me is whether the remaining problem statement, namely:

  • A public service on https://app.plex.tv
  • Wishing to talk to a local service on [IP]
  • That cannot use a public DNS name that points to a local IP (due to asinine DNS rebinding implementations)

Is that much different than what the Network Discovery API tried to solve. Which is to say, that it's not an easy problem, but also one with a lot of prior art in exploring the implications and tradeoffs involved.

@rcombs
Author

rcombs commented Sep 9, 2020

> reasonably isolate their local networks from the web

To be clear, I'm happy to have any solution require a very explicit opt-in from the server (I think exposing a CA-signed cert does a lot for this, but signaling via some sort of HTTP header or TLS extension or what-have-you would also be fine by me).

> RFC 6762

Yeah, if we could get trusted certs for mDNS addresses that'd be excellent (at least in most cases; now and then we run into cases where multicast doesn't properly traverse a LAN…), but as far as I'm aware there aren't any plans for such a thing.

> maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?

I suppose that might be the only option available to us if nothing else is done. Authentication when the origin is http:// is pretty fraught with peril and we've been trying to move away from encouraging it, though.

> This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.

The arguments against overriding addresses we've discussed have all been around network admin control, though; not user security. I don't think we've established any clear reason why overriding for LAN addresses in particular is unacceptable.

> That cannot use a public DNS name that points to a local IP (due to asinine DNS rebinding implementations)

Well, asinine rebind-protection implementations are one case, but another is the no-WAN case, which can come up with a public-address origin in the case of a cached PWA, or with private-address cases regardless.

> the Network Discovery API

We do have a multicast-based LAN discovery protocol, though it's been used less and less for years since cloud-based auth became available. Did that API ever allow for secure, authenticated communication?

I'd also like to emphasize that even for cases that currently are allowed, and would continue to be allowed with this spec, we currently have to use plaintext connections on LAN for no particularly good reason. I know a lot of people who don't care about this, and assume that LANs can be thought of as secure (and it's not easy to convince people that's not true); many argue that even requiring users to explicitly opt-in to HTTP fallback on LAN is an excessive UX burden. I've tried to argue for years that using TLS shouldn't be considered an excessive requirement, but as long as these cases exist, LAN TLS in browsers is always going to be flakier than plaintext, and we'll continue to have users and support staff complain about TLS requirements. This is where a lot of my frustration with the status quo comes from.

@letitz
Collaborator

letitz commented Sep 10, 2020

@sleevi:

> > This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins?

> No, I don't think this is viable, and the least favorable of all options. The complexity calculation is inverted (it's easier, not harder), it's worse for users ("secure means secure, except when it doesn't"), and it further promulgates the notion of a security boundary that largely doesn't hold; that is, the local network is less secure, which is part of why this exists in the first place. This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.

Could you explain why it is easier for an attacker to impersonate a local and/or private IP address, and more generally why the local network is less secure? Not disagreeing, just curious.

As for "secure means secure, except when it doesn't": this waiver would not result in the Chrome omnibox displaying a "secure" badge; it would only allow those websites to make requests to other private IPs upon a successful pre-flight request. I do not think users would notice, so I don't think it would confuse them.

@rcombs:

> > maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?

> I suppose that might be the only option available to us if nothing else is done. Authentication when the origin is http:// is pretty fraught with peril and we've been trying to move away from encouraging it, though.

Glad to hear that there is a workaround to your problem! That being said, I agree that being forced to use naked HTTP in 2020 is sad. In my eyes it seems that authenticating in cleartext to a box on the local network is less dangerous than sending those credentials in cleartext halfway across the world, but that may stem from my above misunderstanding of the security properties of the local network.

@rcombs
Author

rcombs commented Sep 10, 2020

Cleartext on LAN is almost certainly safer than cleartext over the internet (any attacker that can intercept your LAN traffic can also almost always intercept your WAN traffic), but it's still an added risk that we shouldn't have to take today.

@letitz
Collaborator

letitz commented Oct 19, 2020

Ok, so I've taken a deeper look at WebTransport, and I believe the idea @mikewest floated in #23 (comment) is possible in Chrome today.

One can wire a WebTransport stream to an HTMLMediaElement using MediaSource. There is even sample code 🥳

It would require some work to support seeking. It seems to me that a fairly simple protocol and server could handle that. Especially so since when loading media from the local network, throughput and latency should allow for a naive implementation to perform well.
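For illustration, the "fairly simple protocol" could be nothing more than fixed-size byte-range requests over a bidirectional stream; seeking is then just a request at a new offset. This framing is entirely hypothetical, not anything WebTransport specifies:

```javascript
// Hypothetical 12-byte request frame: 64-bit big-endian file offset
// followed by a 32-bit length. The server answers each request with the
// requested slice of the media file.
function encodeRangeRequest(offset, length) {
  const buf = Buffer.alloc(12);
  buf.writeUInt32BE(Math.floor(offset / 0x100000000), 0); // offset, high 32 bits
  buf.writeUInt32BE(offset >>> 0, 4);                     // offset, low 32 bits
  buf.writeUInt32BE(length, 8);
  return buf;
}

function decodeRangeRequest(buf) {
  const offset = buf.readUInt32BE(0) * 0x100000000 + buf.readUInt32BE(4);
  return { offset, length: buf.readUInt32BE(8) };
}
```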

@rcombs
Author

rcombs commented Oct 19, 2020

So I'd need a QUIC server (a task unto itself, though one I was already intending to take on eventually, just not immediately and under time pressure), with a custom protocol on top of that? Or, I guess I could implement HTTP/3 on top of QUIC within JavaScript, and shim that into a wrapper around Fetch? And then I'd need a certificate valid for no more than 2 weeks (vs my current certs, which are CA-signed and valid for 3 to 12 months), meaning I'd have to locally generate self-signed certs and build out infrastructure to store and distribute fingerprints to clients.
Plus, I'd need to rotate them weekly, generate each new cert well in advance of the current one's expiration with the notBefore/notAfter timestamps front-dated with some overlap (1 week?), and make sure clients always accept fingerprints for at least 2 certs at any given time. And then if we wanted to support cases where users potentially have no internet connection for an extended period of time (we call this the "nuclear submarine case" after a real support request we've had about it), we'd have to either generate a long period's worth of certs well in advance, or give clients the ability to fetch the next valid fingerprint from the server over a connection using a current one… and advise users to make sure that every client connects at least once a week.
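The rotation bookkeeping described above can be sketched with the comment's hypothetical numbers (weekly rotation, two-week validity, one week of overlap), purely as an illustration:

```javascript
// Certs sit on a weekly grid; each is valid for two weeks, so at any
// instant exactly two certs are inside their validity window and clients
// must accept both fingerprints.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Week indices of the certs a client should accept at time `nowMs`:
// this week's cert plus the previous week's still-overlapping one.
function acceptableCertWeeks(nowMs) {
  const current = Math.floor(nowMs / WEEK_MS);
  return [current - 1, current];
}
```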

Like, there are parts of this that I'd be happy to do, but in practice I don't think this entire setup is realistic to ask all the relevant teams to build out, and it certainly isn't going to be ready in a short period of time. What it would end up coming down to is just falling back to plaintext in all of these cases, which means giving users a button that ultimately says "make this problem go away", which they're going to click whether they're really in a case that needs it or not, and whether it's really safe or not. As long as I have to have that button available, there are going to be users who click it on a coffee shop wifi network and have their auth token immediately stolen over plaintext HTTP, and preventing these attacks was supposed to be the whole point of these requirements to begin with.

@letitz
Collaborator

letitz commented Oct 30, 2020

I understand your position - it is certainly no small amount of work to work around the restriction in this way.

What about the aforementioned workaround of redirecting your users to http://<media server's IP> instead of http://plex.tv? Insecure local websites are not subject to any more restrictions under CORS-RFC1918 than they are now, and a locally-served page can embed content fetched from secure public websites.

@letitz
Collaborator

letitz commented Dec 15, 2020

Hi @rcombs, any thoughts on my previous comment?

@rcombs
Author

rcombs commented Dec 15, 2020

That'll probably have to be our solution if we don't come up with anything else, it's just also a really bad route to have to take, since it means giving people options (options that they really do need to take in supported cases, so they can't be behind too many layers of "here be dragons" warnings!) that send auth tokens over plaintext, even if "only" over LAN. It's not difficult to imagine situations where an attacker on a public network could arrange for the web app to believe it's in this situation and offer a fallback (or even situations where this happens completely legitimately while on a public network), which the user would, as we're all aware, most likely click through. So it's a "solution" in much the same way that early browsers allowing easy click-through on self-signed TLS certs was: sure, the user's immediate issue is solved in the legitimate case, but it also defeats the security measure to a substantial extent.

@letitz
Collaborator

letitz commented Dec 17, 2020

I see. Compared to the status quo I view the change as security-positive. Whereas before the user would send authentication material in plaintext over the public internet, now that material is sent over the local network. This seems like a strict reduction in attack surface. Is the issue that the local server requires the user to enter username and password again, instead of accepting the cookie obtained from https://plex.tv?

I'm not sure I understand the scenario you refer to. A user is in a malicious coffee shop and has their media server with them, then connects to https://plex.tv. The coffee shop owner, i.e. the attacker, has configured their router to drop incoming DNS responses mapping to private IP addresses. The webapp determines it is in such a situation and falls back to http. The coffee shop owner then serves a fake media server phishing website at the target IP address (gleaned from the DNS response)?

@rcombs
Author

rcombs commented Dec 17, 2020

There are a few tricky things here that overall probably make it a lateral move security-wise, making some things very marginally better and some things distinctly worse, while being unambiguously a UX downgrade.

We currently don't actually send any auth information over the public internet in plaintext in these cases; the only thing loaded in plaintext is the webpage itself (solely for the purpose of not triggering mixed-content restrictions, which here force us to be less secure!), while all actual private communications with the service are over HTTPS; auth data is stored in localStorage rather than cookies. This means we're protected against passive MITM on WAN, but not active WAN MITM that, say, injects modified code that uploads the contents of localStorage somewhere nefarious. Switching to pointing the user's browser directly at the LAN server they're trying to connect to would solve that particular attack vector.

However, it would also mean that the user would have to authenticate to the server. Previously this meant actually entering their Plex username and password, which we've moved away from because it meant encouraging users to enter their passwords on sites other than our own (and obviously wasn't password-manager-friendly). Now, we instead use a SSO mechanism, where the user clicks "sign in" on the server's copy of the web app, and it directs them to a page on plex.tv where they're prompted for whether they want to sign the server in question into their account. However, once again we're met with a UX challenge: when you're connecting to a server like this on LAN, this prompt can't be too alarming, nor too difficult to click through, since it happens during normal usage! So we're ultimately forced to make it easy for users to effectively give access to their account to any device with the same IP address they're connecting from, which puts us in yet another position of having to (whether we want to or not) train users to take an action that's potentially dangerous, because we can't distinguish the safe case from the attack case.

There's also a significant UX problem with this model: we'd have no way of knowing whether the server is actually accessible over plaintext HTTP before navigating the user's browser to it. This means that if the network situation turns out not to be exactly what we expected, we'd end up sending users to browser "This site can't be reached" messages, and have no way of knowing when that's occurred.

The coffee shop owner, i.e. the attacker, has configured their router to drop incoming DNS responses mapping to private IP addresses.

This can also be anyone else in the coffee shop, or even a device left there connecting to the network; you don't need to own the network to perform a basic ARP poisoning attack and take an active-MITM position on any other user's traffic.

The webapp determines it is in such a situation and falls back to http. The coffee shop owner then serves a fake media server phishing website at the target IP address (gleaned from the DNS response)?

The attacker in an active-MITM position can simply handle any SYN to any address on port 32400 (the TCP port the media server runs on) by connecting it to a malicious HTTP server. We have some mitigations that attempt to prevent the client from prompting to fallback on plaintext HTTP when it's not actually on the same network as the user's server (based on public IP address), but they can't be completely robust against these cases; apparent IP addresses are not an authenticator.

This doesn't even get into how easy MITM can be on home LANs (where fallback is intended under this model!), between open wifi networks, ones with default passwords, and easily-infected IoT devices. Honestly, I would consider a system that can be compromised by an attacker on the local network to be overall more concerning than one that can be compromised by an attacker at the ISP (though nearly all cases of the latter we're describing here are also cases of the former).

So yes, navigating the browser to the local server by IP address is a solution, but I maintain that it's a dangerous one that provides little if any overall security benefit over the status quo, worsens the user experience, and continues to be vulnerable to a variety of the kinds of attacks that browser security restrictions attempt to address in the first place, so a proper solution for these use-cases is still needed if our goal is to protect users from attackers.

@rcombs
Author

rcombs commented Dec 17, 2020

I'd like to reiterate that the primary objection (at least, the primary one that still applies even in the most restricted designs) I've seen to the solutions I've proposed has been that they would allow a user or a website to bypass controls put in place deliberately by a network administrator… but all of the workarounds people have suggested (WebRTC, WebTransport, navigation by IP address) allow those exact same controls to be bypassed. Is there any reason why plain HTTPS, with its well-developed tooling and support in browser APIs and media stacks, can't have the same local-network capabilities that all of these other methods enjoy? If so, I can't think of it, and haven't seen it articulated by anyone else.

@sleevi
Contributor

sleevi commented Dec 17, 2020

Is there any reason why plain HTTPS, with its well-developed tooling and support in browser APIs and media stacks, can't have the same local-network capabilities that all of these other methods enjoy? If so, I can't think of it, and haven't seen it articulated by anyone else.

The difference here, vs say WebTransport or WebRTC, is that those methods have explicit opt-in models for the target, which this proposal is actually aligning with, and unlike your plaintext example, the connections are encrypted and authenticated in a way that can prevent both the passive and the active MITM scenario (to some extent; there are other issues with those methods)

@rcombs
Author

rcombs commented Dec 17, 2020

The difference here, vs say WebTransport or WebRTC, is that those methods have explicit opt-in models for the target

I suggested providing a TLS extension earlier; you could even piggyback it on ALPN or the like. Or it could even be an HTTP header much like the ones this proposal adds.

unlike your plaintext example, the connections are encrypted and authenticated in a way that can prevent both the passive and the active MITM scenario

To be clear, I don't want to use plaintext under any circumstances; I want to provide robust encryption and authentication to all users in all cases. I'm just currently left with no other practicable choice under these conditions.

@sleevi
Contributor

sleevi commented Dec 17, 2020

I suggested providing a TLS extension earlier; you could even piggyback it on ALPN or the like

OK, so just to make sure here: you’re talking about the server fingerprint annotation approach, combined with the above, right?

If that’s the case, then WebRTC and WebTransport both have their own issues there, and separately being dealt with. The annotation approach rapidly defeats the authenticity/confidentiality guarantees (e.g. by allowing shared keys among all users), which can then defeat mixed content protections or other important protections. There’s ongoing work with those to try to address some of the security issues caused by allowing the application to decide how to auth the origin / bypass the origin authentication provided by the browser.

@rcombs
Author

rcombs commented Dec 17, 2020

Hmmmm, so to go over what that would actually look like real quick:

  • Fetch request has a URL of https://[IP]:32400/[path]
  • App passes in the fingerprint of the certificate (or public key?) we're expecting to see at that address (which means we need to add some extra infrastructure to keep track of those, but that's very doable)
  • Possibly the request is made with a SNI-like TLS extension saying "I'm expecting to see a cert/key with fingerprint X" (otherwise rotation gets very dicey if this is cert-based fingerprinting instead of key-based)
  • The server must respond with a TLS extension indicating "Yes, I am aware that you're a web browser making a non-DNS-based HTTPS request, and I'm fine with this"

This should be doable. My main concern is around renewal; I see 3 possible solutions here:

  • Server keeps 2 valid certs available in rotation and presents the one corresponding to the SNI-ish mechanism I mentioned
  • Generating new certificates a substantial period before switching over to a new one, coupled with the client being able to specify multiple expected fingerprints rather than just one, and central infrastructure that tracks and distributes both at once
  • Fingerprinting based on the public key instead of the certificate (which would require long-lived public keys, which I don't think is especially desirable)

Given the choice, I'd pick the first option, as it'd be by far the easiest for me to implement. None of these options is quite as simple as telling the browser "I expect the target to have a valid cert for [domain] under the browser's trust system, and the server can affirm that it's okay with this DNS bypass by opting-in with a TLS extension", but they would all be massively better than anything available right now, and I'd be thrilled to implement any of them. (Plus, this would probably be easier to roll out for anyone who doesn't already have centralized issuance of browser-trusted TLS certs to all their devices.)

My only other concern is that this only provides support within Fetch, and not in the native media APIs.
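To make the flow above concrete, here's a minimal sketch of what such a request might look like, assuming a hypothetical `expectedCertificateHashes` fetch option (no such option exists in any browser today; the field name and shape are invented here, loosely mirroring WebTransport's `serverCertificateHashes`):

```javascript
// Hypothetical only: no browser implements this fetch() extension.
// The option name and shape are invented for illustration.
function pinnedRequestInit(fingerprintHex) {
  return {
    expectedCertificateHashes: [
      { algorithm: "sha-256", value: fingerprintHex },
    ],
  };
}

// Intended usage (would not work in any current browser):
// fetch(`https://${serverIp}:32400/some/path`,
//       pinnedRequestInit(expectedFingerprintHex));
```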

@rcombs
Author

rcombs commented Dec 23, 2020

I've just realized that the rotation situation can be simplified a bit for implementers by actually using SNI as-is rather than a new extension that would require new infrastructure. For instance, 88cf03426602b5010fb0ad6963ef984ea382b906603cf0c6fac242c72bcfde20.sha256.print* (using the illegal print* TLD, as * is guaranteed to never appear in actual DNS usage, but can be conveyed by SNI just fine); alternately, a reserved TLD could be used. There is no need for the server to actually present a certificate with a CN or SAN that covers that domain (and it'd be impossible to produce), only one with the specified fingerprint.

@rcombs
Author

rcombs commented Mar 18, 2021

Does anyone have any thoughts on the fingerprint-annotation Fetch extension outlined above? I'd really like to get this case addressed.

@sleevi
Contributor

sleevi commented Mar 18, 2021

It’s one of the options we’d previously explored, but unfortunately ruled out. There are technical issues with your specific proposal (e.g. * is also arguably invalid in SNI as well), but the general idea of the fingerprint approach runs into the same issues I alluded to.

I don’t have much to add beyond that, unfortunately. It’s definitely an area of active interest and exploration.

@rcombs
Author

rcombs commented Mar 18, 2021

The annotation approach rapidly defeats the authenticity/confidentiality guarantees (e.g. by allowing shared keys among all users), which can then defeat mixed content protections or other important protections.

If the alternative is forcing people to fall back on loading the entire application, JS included, over plaintext HTTP, is that really any better? At some point, trying to enforce rules against secure apps doing insecure things does more harm than good, when the only way to opt out of the restrictions is to completely abandon any semblance of security whatsoever. We're never going to web-policy our way out of people sometimes writing vulnerable apps, and right now all this is doing is preventing people who do want to write secure apps from having the tools required to do so.

@estark37
Collaborator

@rcombs a quick clarification -- I'm having trouble understanding why a Fetch-with-opt-in-cert-fingerprint is workable for Plex but WebRTC/WebTransport certificate fingerprints aren't. I understand that the latter introduces substantially more engineering work for you but in a world of infinite time/resources would the two options be equivalent to you? (asking for the sake of understanding your requirements)

@rcombs
Author

rcombs commented Mar 18, 2021

There'd likely be a few issues with WebRTC/WebTransport around embedded browsers not supporting them, possibly around local discovery for WebRTC, and neither has an obvious way to apply to the native media player infrastructure (which means all playback must be over MSE, which adds substantial latency and other limitations when playing on-disk content), but otherwise, yes, they're technically equivalent. So, theoretically those options might be workable (with limitations) in a world where I can successfully argue for infinite dev time on the project (which to be clear very few devs will do as long as "just load over HTTP" remains on the table)… Just, I haven't seen any convincing policy reason for those to be available, but a plug-and-play HTTPS API not to be.

@rcombs
Author

rcombs commented Mar 28, 2021

Also worth pointing out, in re: the "could be used to get around mixed-content restrictions": half the contexts where this comes up are ones where those restrictions are disabled altogether. The current solution to this problem is to use plaintext HTTP connections, regardless of origin, because on embedded systems there is no web policy stopping anyone from doing that. You cannot prevent people in these situations from doing insecure things via web policy; you can only give them the tools required to build secure apps, and hope that they do so. If there were secure ways of handling these cases, maybe embedded browser engines would restrict mixed content, but that's not going to happen as long as these issues still exist.

@rcombs
Author

rcombs commented Apr 7, 2021

Could you pass an auth token or do an oauth flow between the public web app and the media server?

We already do something along these lines, but an attacker could just as easily mimic a server instance and send the user through the SSO path, and we'd have very little way to tell that anything's wrong (we do some alerting when the IP address differs or in other suspicious-looking cases, but that isn't foolproof).

Do you mean e.g. an Android WebView in an Android app? I'm not sure that embedded browsers like that are realistically going to implement any of the things that we're talking about here.

I'm more referring to platforms where either all apps are essentially web pages (possibly with thin wrappers around them), or where that's the only way to deploy a cross-platform app. Smart TVs, game consoles, set-top boxes, Chromecast, that sort of thing. I wouldn't expect 100% deployment (nor would I expect it to roll out quickly), but if cert-fingerprint-annotation landed in the Fetch spec I wouldn't be surprised to see it make its way into some of those contexts. Even if the spec nominally called for a user prompt, there'd be a decent chance we'd be able to disable it there (much like how we currently disable mixed-content constraints in most cases). In theory, those platforms could provide us with platform-specific APIs for this, but having a standard would make it much easier to argue for.
Amusingly, an Android WebView wouldn't be as much of an issue, since there we'd be able to bypass the browser networking stack altogether and call into a Java-provided fetch function (though we don't use web views for LAN interfacing on mobile platforms currently, and don't plan to).

@estark37
Collaborator

estark37 commented Apr 8, 2021

an attacker could just as easily mimic a server instance and send the user through the SSO path

I don't think I understand this attack. Wouldn't the user have to authenticate the attacker's server via the PAKE?

@rcombs
Author

rcombs commented Apr 8, 2021

I'm talking about a case where the attacker has directed the user to the SSO page for their server without going through a PAKE at all. The SSO service has no way of knowing that hasn't occurred.

@KenjiBaheux

Hmm, yes, though you could probably partially work around this by opening it in a new tab and postMessageing to the opened page and closing it if you don't get a response after some amount of time. Not a great UX, though, because the user would see an error page.

Naive question, does it have to be a new tab?

Although not yet available, I'm wondering if this could be a use-case for Portals, which allows another top-level document to live inside the same tab. The Portal would initially be off-screen / hidden until it responds to postMessage. Note: there are potential caveats with regard to privacy-motivated restrictions on cross-origin communication channels.

@letitz
Collaborator

letitz commented May 4, 2021

We on the Chrome team (@estark37, @mikewest, @sleevi and I) have met to discuss this. We stand by the recommendations we made before, and can now offer a clearer reasoning for them.

In the short term

Our suggested workaround is to navigate from https://plex.tv to http://$PLEX_SERVER_IP when the router blocks DNS responses pointing to $PLEX_SERVER_IP.

We understand that this makes for a sub-optimal UX in case of failure, but in the common case it should work fine. We also understand that this results in a degradation of the user’s security for the following two reasons. First, it encourages users to initiate Plex SSO flows from untrustworthy origins, desensitizing them to the risk of being phished. Second, this puts https://plex.tv in the slightly awkward position of having to decide whether or not to trust http://$RANDOM_IP with auth tokens.

Another possibility is building on top of WebRTC, but that would likely represent a very large engineering effort, and a simpler solution is coming soon to the Web Platform. This brings me to my next point...

In the longer term

Our suggested solution is WebTransport and its certificate pinning mechanism.

We acknowledge that this represents a fair amount of work, but it should be significantly easier than building on top of WebRTC; our hope is that some of the necessary investment will take the form of reusable libraries (whether by Plex or someone else). We also believe it especially worthwhile given that both http://plex.tv and http://$PLEX_SERVER_IP are likely to lose access to more and more web platform features as the platform moves toward encouraging https use in stronger ways over time. Absent Private Network Access, this would likely be a wise investment anyway.

We expect WebTransport over HTTP/3 to ship in the medium term (it has begun an Origin Trial) with mitigations to protect against key sharing and other substandard security practices, including a) a short maximum expiration time for pinned certificates, and b) a browser-specific mechanism for revoking certain keys that have been subject to abuse. In the long term, we plan to try to port these restrictions back to WebRTC to align security models.
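For reference, the certificate pinning mechanism looks roughly like this in practice. `serverCertificateHashes` is the real WebTransport option name; the address and hash bytes below are illustrative placeholders:

```javascript
// Build the WebTransport options for a pinned-certificate connection.
// `sha256Bytes` is a BufferSource containing the SHA-256 of the
// server certificate's DER encoding.
function pinnedTransportOptions(sha256Bytes) {
  return {
    serverCertificateHashes: [{ algorithm: "sha-256", value: sha256Bytes }],
  };
}

// In the browser (WebTransport is not available outside one):
// const wt = new WebTransport(`https://${serverIp}:32400/wt`,
//                             pinnedTransportOptions(hashBytes));
// await wt.ready; // rejects if the presented cert doesn't match
```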

What about fetch?

Remains the question posed by @rcombs a few times in this discussion: why not add the certificate pinning capability to the fetch() API, if it is fine for WebTransport and WebRTC?

We believe that the distinction lies in the fact that WebTransport and WebRTC operate outside the origin model of web security, whereas fetch() is firmly inside.

Browser features built on top of secure contexts and the same-origin policy assume that information sharing between two documents with the same https://foo.example origin is acceptable, because foo.example’s owner can share the information through the backend anyway. The browser uses DNS and the Web PKI to authenticate the origin’s owner, and then exclusively relies on the origin for security checks on cookies, the disk cache, socket pools, service workers, and more.

Allowing a different method of authenticating, either to override DNS resolutions or to use alternatives to the Web PKI, effectively requires creating suborigins for certificate identities. This is because state should not be shared between two (sub)origins with different owners, and in the absence of a global DNS or PKI the only proxy for ownership is certificate identity.

Creating per-certificate suborigins is not fundamentally impossible, indeed we previously had a project called - you guessed it - suborigins that did something similar but never got off the ground. Nevertheless it is quite complex, and we would much prefer to support Plex’s current use cases with Web Transport, which is simpler to reason about and doesn't require a complex, risky implementation effort in the browser.

@rcombs
Author

rcombs commented May 4, 2021

I did suggest giving pinned-certificate origins pseudo-domains that are guaranteed never to be real public-DNS-resolvable valid-for-cert-issuance domains (e.g. [cert-fingerprint-as-hex].[IP-addr-as-single-label].local or any of several alternative options), which to my understanding solves the origin issue.

Assuming this sticks, I'm left unsure what the point of this spec is, beyond the additional restrictions it adds. The only viable use-case I can see for the new CORS headers is use as an optimization to bypass WebTransport-based I/O and access native media stacks if the user happens to be in a network condition where that happens to work… and perhaps applications where the author has complete control of all networks the application runs on?

So I'm left with a few recommendations for app developers:

  • Don't rely on these new headers being useful; they may be in some situations, and may work on your system, but any application that uses them cannot provide any guarantees of functionality on consumer networks.
    • You'll instead need a custom HTTP/3 stack that handles a hot mess of "HTTP/[something] over WebTransport over HTTP/3", which I'll probably end up having to implement in libh2o, but if you use any other HTTP stack, glhf
    • On the client side, you'll need some sort of client that implements HTTP/[something] over WebTransport, which I'll proooooooobably be writing and might be able to open-source? but idk if there'll be an org interested in publicly maintaining it and addressing issues and PRs
    • For cert renewal, you'll need to generate a new cert on an interval somewhere between 7 and 13 days, keep all clients up-to-date on both the current and next cert fingerprints somehow (grab the new one from the server over a connection provided via the current one? periodically fetch via regular HTTPS from a centralized service? pick whatever's right for your app), and have the server only swap over to the new cert shortly before the existing one expires. Again, have fun.
    • Hey, look on the bright side: at least you don't need CORS preflights!
    • If you need to play media that isn't already in an MSE-friendly segmented format, you might benefit perf-wise from using these headers on a best-effort basis, but you'll need to fall back to something MSE-based if they don't work
      • This proooobably means converting to MSE-friendly segments server-side in realtime, which will probably add some latency and require massive maintenance investments
      • Alternately, you might be able to demux media files within JS and repack into MSE segments client-side? It's been done for MPEGTS and there's no theoretical reason you couldn't do it for, say, ISOBMFF or Matroska; it's just once again a massive maintenance burden. Maybe throw emscripten at libavformat, at that point? I dunno, Chromium's got a perfectly functional copy of libavformat sitting right there in the process's address space that you can't touch.
  • If building an Electron-style desktop application, you may be able to patch the browser engine's networking stack to bypass DNS resolution (and can also likely bypass CORS checks altogether); this is a far more robust and complete solution than anything provided here
  • If running on an embedded system, you can likely disable CORS checks altogether; if possible, contact your platform vendors and request a platform-specific API to allow DNS resolution bypass; otherwise, plaintext fallback may continue to be your only feasible option in some cases
  • If you see all this, your target userbase is tech-enthusiast-y, and you think "wouldn't it be easier to just use HTTP insecurely, or tell users to bypass the cert warning when loading the page", this is the one case where I won't blame you for that
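To make the rotation bullet above concrete, here's a sketch of the schedule logic, assuming a 14-day maximum lifetime (WebTransport's actual cap is 2 weeks; the halfway and one-day margins are illustrative choices, not requirements):

```javascript
// Dual-cert rotation schedule: start advertising the successor
// fingerprint halfway through the current cert's lifetime, switch the
// served cert one day before expiry. Timestamps are epoch milliseconds.
const DAY = 24 * 60 * 60 * 1000;
const LIFETIME = 14 * DAY;

function rotationState(issuedAt, now) {
  const age = now - issuedAt;
  return {
    advertiseNext: age >= LIFETIME / 2,   // clients should learn the next fingerprint
    serveNext: age >= LIFETIME - DAY,     // server presents the successor cert
    expired: age >= LIFETIME,
  };
}
```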

@annevk

annevk commented May 5, 2021

The point of this specification is to add additional restrictions. It's not meant for anything else.

@rcombs
Author

rcombs commented May 5, 2021

I'm referring to the added CORS headers; even if a highly-determined developer attempts to actually use them, there's an excellent chance they'll discover (possibly only after shipping) that they can't actually be relied upon as a solution.

@letitz
Collaborator

letitz commented May 5, 2021

I did suggest giving pinned-certificate origins pseudo-domains that are guaranteed never to be real public-DNS-resolvable valid-for-cert-issuance domains (e.g. [cert-fingerprint-as-hex].[IP-addr-as-single-label].local or any of several alternative options), which to my understanding solves the origin issue.

The issue with an origin-based approach is not a lack of technical options - many similar to what you propose have been explored - so unfortunately this does not solve the problem. The problem is that implementing such a change in the browser is a risky and complex endeavor that we do not have the resources to take on right now.

Assuming this sticks, I'm left unsure what the point of this spec is, beyond the additional restrictions it adds.

As @annevk said, the point of this spec is indeed to restrict the capabilities of the web platform to protect the overwhelming majority of users from malicious attacks a la Drive-by Pharming.

The only viable use-case I can see for the new CORS headers is use as an optimization to bypass WebTransport-based I/O and access native media stacks if the user happens to be in a network condition where that happens to work… and perhaps applications where the author has complete control of all networks the application runs on?

For intranet scenarios, it is indeed quite likely that the application author has control over the networks against which the application is supposed to run.

If you see all this, your target userbase is tech-enthusiast-y, and you think "wouldn't it be easier to just use HTTP insecurely, or tell users to bypass the cert warning when loading the page", this is the one case where I won't blame you for that

It seems to me that knowledgeable users could modify the system hosts file to point the Plex subdomain to the right IP. This could be facilitated by a script provided by Plex, to be run with root permissions on first setup. Then DNS queries need not be issued through the router, circumventing the issue?
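For instance, an entry along these lines (the hostname here is a made-up placeholder, not a real Plex-issued name):

```
# /etc/hosts (or C:\Windows\System32\drivers\etc\hosts)
# Hypothetical entry pinning a server's public hostname to its LAN
# address, so the lookup never passes through the router's DNS filter:
192.168.1.10    abc123.example-hash.plex.direct
```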

@letitz
Collaborator

letitz commented May 5, 2021

I had an additional thought: could the WebTransport connection be mediated by a ServiceWorker? This might make it quite transparent to the page issuing the requests.
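A rough sketch of the routing half of such a worker; the server address is a placeholder, and `requestOverTransport` is an assumed app-defined helper that tunnels the request over an established WebTransport session:

```javascript
// Decide whether a request should be proxied over the WebTransport
// connection instead of the normal network stack. SERVER_IP is an
// illustrative placeholder the app would fill in at runtime.
const SERVER_IP = "192.168.1.10";

function shouldProxy(requestUrl) {
  return new URL(requestUrl).hostname === SERVER_IP;
}

// In the service worker, with an app-defined requestOverTransport():
// self.addEventListener("fetch", (event) => {
//   if (shouldProxy(event.request.url)) {
//     event.respondWith(requestOverTransport(event.request));
//   }
// });
```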

@rcombs
Author

rcombs commented May 5, 2021

It seems to me that knowledgeable users could modify the system hosts file to point the Plex subdomain to the right IP. This could be facilitated by a script provided by Plex, to be run with root permissions on first setup. Then DNS queries need not be issued through the router, circumventing the issue?

If the user always had a native application installed on every device they wanted to use the app from, I wouldn't need this to run in a browser.

I had an additional thought: could the WebTransport connection be mediated by a ServiceWorker? This might make it quite transparent to the page issuing the requests.

Hmmmmmmmmm, that looks veeeeeeeeery interesting, and potentially very powerful. I'd expect this would eliminate the user-facing regressions this change would introduce (thanks to ServiceWorkers' integration with the media stack), and at least somewhat mitigate the invasiveness of the client-side changes required, though it'll still be quite the undertaking for every app developer who wants to use this.

@rawnsley

I know WebSockets have been mentioned in other threads, but is there a reason they wouldn't work for some of these Plex use cases? Rightly or wrongly WebSockets aren't restricted so far in any of the browser implementations, including Chrome 92.

@letitz
Collaborator

letitz commented Jun 21, 2021

A secure context cannot open a WebSocket to a ws:// endpoint; it is blocked as mixed content. While it is true that WebSockets are not currently covered by the Private Network Access implementation in Chrome, they will eventually be subject to the same restrictions as regular fetches; see #14.

@rawnsley

Thanks for the clarification @letitz. I'm going to reimplement our system to use WebSockets anyway and hope it takes some time to get closed down. I know I'm late to this party, but it feels like we are breaking a useful and efficient topology because a few printer and router manufacturers haven't bothered to secure their APIs properly.

The official workaround appears to be replacing a simple HTTP server with a complex technology stack in order to connect to a browser technology that hasn't been released yet.

I think this spec means the end of solutions built around local-device-to-browser communication in all but cases like Plex, who can't afford to send large data streams to the Internet and back. Communicating via a server is only more secure in the cryptographic sense as the server owner can now see your data! Without wishing to put on my tin-foil hat: this serves the interests of the big tech companies far more than it serves those of the actual user.

@letitz
Collaborator

letitz commented Jun 23, 2021

Thanks for the clarification @letitz. I'm going to reimplement our system to use WebSockets anyway and hope it takes some time to get closed down.

I'm sorry to hear that's the best path forward you see available to you. Please be aware that this will indeed get broken in the medium term.

I know I'm late to this party, but it feels like we are breaking a useful and efficient topology because a few printer and router manufacturers haven't bothered to secure their APIs properly.

Indeed, the root of the problem here is that local servers are particularly juicy and soft targets for CSRF attacks.

The official workaround appears to be replacing a simple HTTP server with a complex technology stack in order to connect to a browser technology that hasn't been released yet.

It's true that WebTransport has not been shipped yet, but it is available via an Origin Trial. Note that it uses HTTP/3 under the hood, so fundamentally we are "simply" upgrading HTTP versions + requiring the use of TLS.

I think this spec means the end of solutions built around local-device-to-browser communication in all but cases like Plex, who can't afford to send large data streams to the Internet and back. Communicating via a server is only more secure in the cryptographic sense as the server owner can now see your data!

I don't believe Plex has any intention of shipping large streams of data to the internet and back. In fact, the point of the WebTransport solution and the redirect-to-local-server workarounds discussed above is to avoid streaming any media over the internet.

All the workarounds I have discussed here and in other threads with impacted developers have upheld this guarantee. Your choice of WebSockets also does. Thus it seems to me that the evidence points the other way: this specification is in fact not pushing developers to round-trip data through the public internet.

Without wishing to put on my tin-foil hat: this serves the interests of the big tech companies far more than it serves those of the actual user.

Security is a numbers game. Protecting billions of users online from router hijacking, e.g.:

... is in their best interest. From https://w3ctag.github.io/design-principles/#priority-of-constituencies:

User needs come before the needs of web page authors, which come before the needs of user agent implementors, which come before the needs of specification writers, which come before theoretical purity.

Here, I believe we are rightly putting user needs before the needs of web page authors.

@letitz
Collaborator

letitz commented May 13, 2022

Hi folks,

It's been a few months since Chrome started a deprecation trial for private network access from non-secure contexts. WebTransport has shipped (with serverCertificateHashes support), and web developers have been able to give that avenue some more thought.

What have we learned?

In short, WebTransport is not a "complete replacement for LAN plaintext connections" - it does not fix this issue. Y'all were right!

What's the problem with WebTransport?

The problem is not that it requires more engineering work (though it certainly does) or is not available on all browsers (it is not). The fundamental problem stems from the intersection of WebTransport security requirements and IoT ecosystem requirements.

A dash of background: WebTransport serverCertificateHashes requires the website initiating the connection to know in advance the hash of the target server’s certificate. Somehow, they must both agree on the hash value.

WebTransport requires that if using this mechanism, the server’s certificate expiry be limited to 2 weeks (increasing this limit to X weeks would not resolve the issue). The limit is a mitigation to guard against unsafe certificate practices that are normally prevented by the Web PKI. This means that device manufacturers cannot simply provision a certificate on the device while they still have it, because users cannot be expected to set up the device within 2 (even X) weeks of the manufacturer shipping it.

Instead, the device must generate a new certificate at setup time. Somehow, the website must know the hash of this certificate. The easiest solution is for the device to connect to the internet and communicate its identity and certificate hash to the website’s backend. Then, when the user navigates to the website to set up their device, the website can somehow match the user back to the device - by public IP, by user ID, or any other means. The website can then attempt to connect to the device expecting the right hash.

Unfortunately for us, it seems the IoT ecosystem has been moving away from this model. Our understanding is that in order for a device to get a coveted MFi (“Works with HomeKit”) certification from Apple, it must not require direct internet access for the initial setup.

If the device cannot directly inform the backend of its certificate hash, then both ends must somehow compute the same public key independently. Sounds like a job for TOTP! Except that it requires the device to have a moderately accurate clock, which is often not the case when powering devices on for the first time. Once again, this problem could be solved with NTP or some such network-based protocol, but the requirement that the device not talk to the internet precludes this approach.

There seems to be no way out of this bind.

What now?

We propose that secure contexts be allowed to fetch resources from private network servers over plaintext HTTP - circumventing mixed content restrictions - given express user permission.

At a high level, websites indicate their intention to access a private network server using some new API. Chrome queries the server for some identification information it can display to the user in a permission prompt. The user then chooses whether to allow the website to access the target server.

This would mean that websites that wish to speak to private network devices must be served over HTTPS. The target device, however, would not have to serve HTTPS. It would only need to respond correctly to CORS preflights, and maybe to some kind of simple identification request.

What would that look like?

This is a rough proposal, some details omitted

We need some kind of API to trigger the permission request when accessing a private network server for the first time. I propose adding a new parameter to the fetch() options bag. For example (bikeshedding certainly required):

fetch("http://router.local/ping", {
  privateNetworkAccess: "private",
});

// Alternatively, to access localhost.
fetch("http://localhost:8123/cats.json", {
  privateNetworkAccess: "local",
});

This would instruct the browser to allow the fetch even though the scheme is non-secure, and to obtain a connection to the target server. Requiring the explicit parameter ensures the feature cannot be abused to bypass mixed content in general.

If the remote IP address does not belong to the IP address space specified as the privateNetworkAccess option value, then the request is failed.

If it does belong, then a CORS preflight request is sent as specified today. The target server then responds with a CORS preflight response, augmented with the following two headers:

Private-Network-Access-Name: <some human-readable device self-identification>
Private-Network-Access-ID: <some unique and stable machine-readable ID, such as a MAC address>

For example:

Private-Network-Access-Name: "My Smart Toothbrush"
Private-Network-Access-ID: "01:23:45:67:89:0A"
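On the device side, producing the augmented preflight response could look like the following sketch. The `Private-Network-Access-*` names come from this proposal and `Access-Control-Allow-Private-Network` from the existing preflight mechanism; the helper function and device object are purely illustrative:

```javascript
// Device-side sketch: build the headers for the augmented CORS preflight
// response. A real server would also validate the request origin and echo
// the requested methods/headers as appropriate.
function buildPreflightHeaders(device, requestOrigin) {
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Methods": "GET",
    "Access-Control-Allow-Private-Network": "true",
    "Private-Network-Access-Name": device.name,
    "Private-Network-Access-ID": device.id,
  };
}

const headers = buildPreflightHeaders(
  { name: "My Smart Toothbrush", id: "01:23:45:67:89:0A" },
  "https://example.com"
);
```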

A prompt is then shown to the user asking for permission to access the target device. Something like:

[Screenshot of the permission prompt]

The -Name header is used to present a friendly string to the user instead of, or in addition to, an origin (often a raw IP literal).

The -ID header is used to key the permission and recognize the device across IP addresses. Indeed, widespread use of DHCP means that devices are likely to change IP addresses regularly, and we would like to avoid both cross-device confusion and permission fatigue.

If the user decides to grant the permission, then the fetch continues. If not, it fails.

Once the permission is granted, then the document can make arbitrary fetches to the target origin for its entire lifetime, through media tags, XHRs, etc.

The permission is then persisted. The next document belonging to the same initiator origin that declares its intent to access the same server (perhaps at a different origin, if using a raw IP address) does not trigger a permission prompt. The initial CORS preflight response carries the same ID, and the browser recognizes that the document already has permission to access the server.
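The keying scheme described above, in which the grant survives DHCP-driven IP changes because it is keyed by initiator origin and device ID rather than by target address, can be sketched as follows. This is a browser-internal illustration, not code from any implementation:

```javascript
// Illustrative sketch: permissions keyed by (initiator origin, device ID).
// The target's current IP address plays no part in the lookup, so a device
// that moves to a new DHCP lease keeps its existing grant.
class PrivateNetworkPermissions {
  constructor() {
    this.grants = new Set();
  }
  key(initiatorOrigin, deviceId) {
    return `${initiatorOrigin} -> ${deviceId}`;
  }
  grant(initiatorOrigin, deviceId) {
    this.grants.add(this.key(initiatorOrigin, deviceId));
  }
  // Called with the ID echoed in the preflight response.
  isGranted(initiatorOrigin, deviceId) {
    return this.grants.has(this.key(initiatorOrigin, deviceId));
  }
}

const perms = new PrivateNetworkPermissions();
perms.grant("https://plex.tv", "01:23:45:67:89:0A");
// Device later appears at a different IP; the grant still applies.
console.log(perms.isGranted("https://plex.tv", "01:23:45:67:89:0A")); // true
```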

What are the alternatives?

Blanket permission prompt

Have a simple "attacker.com wants to connect to the private network: allow/deny" prompt. This resembles what is done in iOS today: https://support.apple.com/en-mk/HT211870

Such a prompt is hard to understand for users. What even is the private network? It also grants too much privilege at once. A single misclick or wrong judgment call (users often click away prompts to get back to what they were doing) means the website is free to scan the local network for attacks and fingerprinting purposes.

Fetch the device info differently

One could imagine the name and ID of the device being served some other way, e.g. at a well-known URL: /.well-known/private-network-access. It would be fetched first, then the prompt displayed, then the CORS preflight would be sent if permission was granted.

In this model, the API used to trigger the permission prompt could be something other than fetch().

This seems to introduce more complexity for not much benefit. Private Network Access already mandates the use of preflights; two additional headers are the easiest option for web developers to implement.

Identify the target differently

Instead of having the target provide a unique ID, we could use:

  • the name: subject to collisions if two similar devices exist on the same network
  • the IP address: subject to collisions and churn with DHCP
  • the MAC address: does not work if the target is not directly reachable on layer 2
  • a public key: complex, would need to design a new protocol to make use of the key later

Ephemeral permission

Removes the need for an ID. The permission only lasts as long as the document that requested it. This would result in permission fatigue for services such as Plex, where the same target is accessed repeatedly.

What's next?

We are interested in feedback on this proposal. Would the proposal address developers' use cases? What do other browser implementors think of it? Replies welcome!

@sleevi (Contributor) commented May 13, 2022

> Removes the need for an ID. The permission only lasts as long as the document that requested it. This would result in permission fatigue for services such as Plex, where the same target is accessed repeatedly.

Does it, though? Are you designing for persistent no-internet, or only for as long as necessary for bootstrap?

@rcombs (Author) commented May 13, 2022

This seems like it would address first-run scenarios and allow developers to implement TOFU setups, which is nice, and I'm sure will get a decent amount of use (especially from small-scale or noncommercial projects). I do think that a lot of IoT scenarios could benefit from also supporting TLS-SRP in these cases (a la HomeKit, replacing the TOTP case you mentioned), so that developers can provide some level of security via a pairing code indicated on the device, its packaging, or some other mechanism. Even in cases where providing such a code isn't possible, this could benefit from opportunistic encryption to provide some baseline protection against passive snooping.

For Plex, I think we might be able to use this in some limited TOFU first-run configuration cases (for platforms where we're unable to communicate an out-of-band setup code to the user), but I'd be concerned about using it past that without some way to negotiate an authenticated TLS connection. While unauthenticated connections may be required for first-run for some devices/apps, they should never be required past that.

@letitz (Collaborator) commented May 18, 2022

Thanks for the responses!

> > Removes the need for an ID. The permission only lasts as long as the document that requested it. This would result in permission fatigue for services such as Plex, where the same target is accessed repeatedly.
>
> Does it, though? Are you designing for persistent no-internet, or only for as long as necessary for bootstrap?

In Plex's case the problem stems from the user's router filtering out DNS responses in a misguided attempt at preventing rebinding attacks. As long as the user owns that router, plex.tv will be unable to access the local device over HTTPS. When the user wants to stream content from the local device, the only option is to speak HTTP, every time.

> I do think that a lot of IoT scenarios could benefit from also supporting TLS-SRP in these cases (a la HomeKit, replacing the TOTP case you mentioned), so that developers can provide some level of security via a pairing code indicated on the device, its packaging, or some other mechanism.

True, and I would like the web platform to eventually support some kind of protocol like TLS-SRP for fully offline secure local communications. With that in hand, we could remove this mixed content exception. Unfortunately, that's a rather large undertaking.

> Even in cases where providing such a code isn't possible, this could benefit from opportunistic encryption to provide some baseline protection against passive snooping.

Sure. Do you mean that the website could encrypt things above the HTTP layer?

> For Plex, I think we might be able to use this in some limited TOFU first-run configuration cases (for platforms where we're unable to communicate an out-of-band setup code to the user), but I'd be concerned about using it past that without some way to negotiate an authenticated TLS connection. While unauthenticated connections may be required for first-run for some devices/apps, they should never be required past that.

I thought Plex used plaintext HTTP as a fallback when the user's router prevented plex.direct domains from resolving to local IP addresses? Or do you suggest instead that users set up port forwarding in that case?

@t2t2 commented May 18, 2022

Since the thread has been Plex use-case heavy, I'd like to point out an open source project that is also facing this issue - obs-websocket, a plugin for OBS (yes, that streaming software) that allows controlling it over WebSocket. One of the main use cases for this is creating (sometimes web) apps that can be used from another device (e.g. a phone or tablet) - switching scenes, controlling volumes, and so on - useful when you don't want to alt-tab out of a game or something and don't want to buy a Stream Deck.

(While not a representative of the project, I maintain one of the browser based frontends built on it and js client library from v5)

HTTPS sites being blocked from connecting to the WebSocket has been a frequent issue in user support, and for maintainers of the plugin, the Plex approach of providing everyone their own certs has been deemed unfeasible. This has led to users bypassing the issue by proxying WebSocket connections through ngrok or PageKite, since those provide HTTPS - however, this also exposes the WebSocket to the open internet (via a hard-to-guess URL, but still, bad).

For web apps, this has led to relying on serving over HTTP for maximum ease of use, which prevents creating PWAs (be it for an app-like experience or for offline support, sometimes requested for use at events with only a local network) and rules out features only supported in secure contexts.


As most apps just use ip:port and a password (set by the user or pregenerated by the plugin) to connect, there isn't a central registry to ask where to connect or which cert to use, like Plex is able to do with user accounts. So a CORS preflight with a prompt for user permission would definitely be the preferred implementation. And since quite frequently it is used to connect to a specific instance, remembering the permission long term would absolutely be better for users.

@bakkot commented Jan 27, 2024

People following this thread may be interested in Chrome's intent to ship the permission + preflight-based feature described above.

@backkem commented Jan 27, 2024

The proposal for the Local Peer-to-Peer API aims to introduce a solution for LAN communication. It's quite similar to what @rcombs alludes to above.

Any feedback from potential users would be highly appreciated. I'll also explore synergy with this spec. @rcombs is there still interest from Plex? @t2t2 how about OBS?

For those interested in IoT use-cases, there is also interest in exploring compatibility with Matter

@rcombs (Author) commented Jan 27, 2024

Unfortunately, I'm no longer at Plex, so I can't confidently say what they'd be interested in today, but I can evaluate these proposals based on how I'd have been able to make use of them when I worked on Plex Media Server; I'd imagine that my concerns here will be relevant to other comparable applications as well.

The prompt + preflight-and-header approach from the current version of this repo appears to resolve the first-run setup use-case to some extent; at minimum, it allows secure-loaded web apps the same provisioning capabilities that were previously available only to insecurely-loaded ones, which is a major improvement. The main caveats are:

  • It doesn't seem to provide any secure connection mechanism, so any authentication or encryption would need to be built on top of plaintext HTTP, and would be extremely challenging to implement for general purposes; this essentially means that uses of this would ~always rely on the security of the LAN
  • It's limited only to Fetch, and requires specific API parameters to be passed, so it wouldn't be possible to do direct <video> playback using this, and integrating it into MSE players (often provided by third-party libraries) could be challenging

Altogether, the proposal seems like it'd probably allow for TOFU setup cases, but I wouldn't have been able to use it at Plex for ongoing routine app usage afterwards (and even if it was possible, I'm not sure if I'd have been comfortable doing so).

The LP2P API is very intriguing. The current details listed there are somewhat sparse, but the "Local HTTPS" section seems like it probably does exactly what I'd have wanted: provides a specialized API for initial mutual authentication with the server, then allows regular web APIs to communicate with the server securely over ordinary HTTPS by using a TLS certificate with a negotiated fingerprint. I have a few questions about it:

  • What address would the client use to communicate with the local HTTPS server? Something like https://[fingerprint]._openscreen._udp:[port?]/?
  • If the client and server have some way of authenticating each other without user input (e.g. a shared secret received from a cloud service and cached locally by both, or some form of mutual proof-of-private-key-possession auth), would they be able to set up a connection that way, or would that always require opening a user prompt for a PAKE code? The PAKE route might not always be viable for per-session auth with headless devices.

If there's somewhere else more appropriate to discuss the LP2P case in more detail, I'd be happy to chat more there, and also to pull in my former colleagues at Plex.

As for my current professional position, I'm not aware of any current interest in this kind of API at Discord (though I'm speaking purely on my own behalf here), but I could imagine a number of worlds where such an interest materializes in the future, so I'll want to keep watching this space regardless 😄.

@backkem commented Jan 27, 2024

Wow, thanks for the detailed reply, this is invaluable to us 🙏. If you'll allow me, I'll follow up on it in WICG/local-peer-to-peer#39.
