-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Origin of blob URLs doesn't match what implementations do #127
Comments
I think the standard has a better approach. Otherwise looking up an aspect of the URL involves going through a global lookup table. Note that the origin of a blob URL is determined at creation time and can therefore be stored in the blob URL (as the File API prescribes). So extracting it from the URL is fine too. And modifying it doesn't work since the entire URL is used as key. |
Also, note that in https://bugzilla.mozilla.org/show_bug.cgi?id=1270451 Gecko fixed this. |
Which standard. Currently the file API spec and the URL spec disagree here. According to the url spec the origin of a blob url with an opaque origin is a new unique opaque origin, since its created from the serialized url. While according to the file API spec the origin should be the same opaque origin as that of the incumbent settings object that created the URL. Of course both would serialize to "null", so from the javascript URL API point of view this wouldn't matter, but when other parts of the spec try to determine if URLs are same origin this does matter. In particular for things like creating workers from blob URLs. But also other situations. Should a blob URL be same origin with the setting object that created them or not? I think yes, and I believe implementations agree (at least chrome does). The File API spec certainly agrees. The current URL spec seems to be the odd one out. |
Not sure what exactly gecko fixed. Are in latest gecko blob URLs no longer same origin with their creating context? I would find that very surprising and broken. And in particular "What is the problem with the model from the URL standard? It seems to work fine in Chrome." makes no sense. Chrome does not implement the model from the URL standard as it doesn't work. |
I don't understand. The File API defines the origin of a blob and puts it in the URL it creates. The URL Standard extracts. I don't think there is a mismatch. |
The file API defines not only the origin of a blob, but also the origin of a blob URL. And defines the origin of that URL to be the same as that of the context that creates it. In particular when that origin is an opaque origin the origin of the blob URL needs to still be the same origin with it. That is impossible to do through serialization and deserialization via the URL itself (separately the FileAPI spec doesn't actually define how implementations should serialize an opaque origin in a blob URL, it leaves that implementation defined). But some out-of-band data (like the already existing Blob URL store) is needed to actually have blob URLs be same origin with the context that created them. |
I see, opaque origins. I think File API should not serialize those other than as "null". Seems like we might reveal more than we want otherwise. What does this mean in practice for opaque origins though? What scenario is broken and is that a problem or a feature? |
I think in practice browsers do indeed serialize opaque origins as "null" in blob URLs, even though the spec might allow other things. So that part is fine. One of the features we'd lose if blob urls would no longer be same origin with their creator would be the feature discussed in whatwg/html#1322, the ability for a sandboxed iframe to start a worker using a blob URL. Something else that would break without this same originness is using fetch to read blobs in opaque origins (but creating a blob url and then fetching that). That more or less is why blink/webkit originally implemented this lookup for the origin (although in chrome it is slightly worse since the entire FileReader API is implemented in terms of fetching blob URLs. So FileReader was broken in opaque origins without this origin mapping. But that's more an implementation detail). |
@bakulf thoughts? |
Okay, @bakulf seemed fine with reverting Firefox to the earlier behavior. I'll need to file a bug. So just to confirm, @mkruisselbrink, the idea is that we'd do some kind of global lookup when determining the origin of a blob URL. Does that effectively mean the blob store is per unit of related-origin browsing contexts? |
Probably per event loop, since they work in workers as well? I dunno, I just heard that phrase and am popping my head in. |
The blob store concept as defined in the latest version of the File API spec definitely needs work, as it doesn't actually reflect what browsers do (and also doesn't seem to match what older versions of the File API spec did). Having it per execution context as in the current spec certainly doesn't seem like it would make sense, as using blob urls in other execution contexts is definitely something that works today. I believe in chrome the thing that best reflects the "blob store" concept is actually fully global. But then the blob url -> (possibly opaque) origin map chrome also maintains is more per event loop. I really need to find the time to figure out what browsers are actually doing in this space, and then figure out how to best write that down in the spec, as what is currently in the specs about blob URLs is all over the place and certainly doesn't match reality... I should be able to find some time for that in the next few weeks. |
I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1282504 to get Firefox back to using a store again for the origin. Leaving this open until we figure out what the scope of that store is. |
@mkruisselbrink so in Chrome |
In Chrome it first tries the blob URL store, and if that doesn't have an entry it parses the origin out of the URL string. |
@mkruisselbrink would you be open to remove the fallback? I'm not sure how having the fallback makes sense if we use the store anyway. |
@mkruisselbrink ping! We should really start fixing this. |
Not sure about not having the fallback in getting the origin for a blob URL. I guess it sort of makes sense from a spec point of view, but what would you return then if a URL doesn't (or no longer does) exists in the Blob URL Store? Presumably a new unique origin? It would then allow code to check if a blob url is (still) valid and not revoked by getting its origin? Not sure I like that. Also note that from the point of view of new URL(some blob url).origin it's indistinguishable if the origin is always extracted from the URL, or if it is first looked up in this map, and if not found it is gotten from the URL. Only in places where same origin checks are done with the unserialized origin (particularly for opaque origins) is there any difference. In blink the store used to map url to origin is currently per-thread, which is problematic in that things that should work between workers and documents don't always work correctly. I think it makes the most sense for the Blob URL Store to conceptually just be completely global. Of course implementations can scope it smaller since blob urls shouldn't ever be resolved cross origin anyway, but no reason not to spec it in the simplest way? (hmm, the File API spec says non-normatively that all cross origin requests should be errors, but from the fetch spec it seems like a no-cors cross origin request would be allowed? I wonder what implementations do...) |
Isn't the blob URL store lookup already exposed synchronously somewhere? (I don't recall offhand how though, so maybe not.) Also, if you put a blob URL in a new browsing context, why would that not work? Can't the user open them somewhere else that might be a new thread/process? |
You can already determine synchronously whether a URL is in the blob URL store: const blob = new Blob(["test"]),
url = URL.createObjectURL(blob),
client = new XMLHttpRequest
URL.revokeObjectURL(url)
client.open("GET", url, false)
client.send() I suppose you could make an argument that we only look up the origin in the blob URL store when it turns out to be opaque from parsing the blob URL. As that would give slightly better performance probably. I would be okay with such an approach, though it still seems a little worse than simply always looking it up or never looking it up (which is not an option as that would break opaque origin use cases). |
@mkruisselbrink thoughts? What's your timeline for defining the blob store properly? |
This fixes whatwg#290, fixes whatwg#127, fixes w3c/FileAPI#20 and fixes w3c/FileAPI#63.
This fixes whatwg#290, fixes whatwg#127, fixes w3c/FileAPI#20 and fixes w3c/FileAPI#63.
In particular some blob: URLs representing an opaque origin could still be same origin with something. That only works if the origin is stored somewhere and is not obtained from the URL. Tests: see #371 (comment). This fixes #127, fixes #290 , fixes w3c/FileAPI#20, and fixes w3c/FileAPI#63.
As mentioned in Origin of Blob URLs, the origin of a blob URL should be the origin of the incumbent settings object when the URL was created. The simple "parse the origin from the url in the first string of the path of the blob url" algorithm the URL spec uses doesn't accomplish that for opaque origins.
The way at least chrome implements this is by maintaining a separate blob url -> origin mapping, and using that to lookup the origin of a blob url. I assume firefox does someting similar (see also whatwg/html#1322 for an example where this behavior matters).
Should the URL spec be updated to actually match the File API spec about blob URLs, and to match what implementations seem to do?
The text was updated successfully, but these errors were encountered: