-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[spec] Normative: Support [Serializable] for WebAssembly.Module #711
Conversation
document/web-api/index.bs
Outdated
@@ -2,7 +2,7 @@ | |||
Title: WebAssembly Web API | |||
Shortname: wasm-web-api | |||
Group: wasm | |||
Status: FPWD | |||
Status: w3c/FPWD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was needed for me to be able to build locally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you update your bikeshed version that should fix it (it will then know wasm is a w3c project).
The reason I changed these is that bikeshed rejects some of the normal stamping with the full string. I still owe Tab filing an issue on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should I update it? I tried bikeshed update
and it still gave this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to actually update your Bikeshed source code, so e.g. go to the director you checked out bikeshed and run git pull
.
document/web-api/index.bs
Outdated
@@ -125,6 +135,36 @@ Note: This algorithm accepts a {{Response}} object, or a | |||
1. Return |returnValue|. | |||
</div> | |||
|
|||
<h2 id="serialization">Serialization</h2> | |||
|
|||
On the Web, {{Module}} has the <code>[<a extended-attribute>Serializable</a>]</code> extended attribute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this be in its IDL? This seems somewhat unclear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the JS API specification is trying to avoid referencing anything which is Web-specific. [Serializable] is defined in HTML. I'm not sure how to put this in its IDL without violating that layering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these objects are not defined in IDL then you need to patch HTML directly as you did with BigNum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I said as much on the original PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are defined in IDL, it's just that that IDL is supposed to not reference Web-specific specifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the IDL: https://webassembly.github.io/spec/js-api/index.html#module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also has Exposed=(Window,Worker,Worklet) so I think you're failing somewhat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@annevk Well, there's a plan for that: whatwg/webidl#468 . This layering is still a goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I think you need to make this something like "Web user agents must augment the Module interface with the [Serializable] extended attribute." so it's an actual requirement and not a statement of fact that isn't based on any established facts...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the wording; I've switched to use it.
document/web-api/index.bs
Outdated
|
||
On the Web, {{Module}} has the <code>[<a extended-attribute>Serializable</a>]</code> extended attribute. | ||
|
||
The [=serialization steps=], given |value|, |serialized| and |forStorage|, are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oxford comma?
document/web-api/index.bs
Outdated
|
||
Given the above engine optimizations, structured serialization provides developers | ||
explicit control over both compiled-code caching and cross-window/worker code | ||
sharing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a note?
document/web-api/index.bs
Outdated
1. [=Compile a WebAssembly module=] from |bytes| and set |value|.\[[Module]] to the result. | ||
|
||
The semantics of a structured serialization is as-if the binary source, from which the | ||
{{Module}} was compiled, is serialized, then deserialized, and recompiled into the target realm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a note as well.
This is a spec addition, right? I think you need to follow the same process to get this addition into WebAssembly. Should be very straightfoward, but we can't commit it until the working draft ships (unless we want to re-spin a new public working draft, and all that entails). |
Hmm, I sure hope this group has the ability to publish new editor's draft without going through the W3C working draft process stuff... |
This PR is very confusing BTW because it says "Memory" several times (e.g. in the PR title) when I believe the actual spec text is for "Module". |
Maybe I misunderstand, but IIUC this PR requires me to change code in WebKit. I agree with the change, but it's not editorial. It's a new capacity, and as such should go through the CG as documented in our process. Doesn't matter if it's small and obvious, we can't just commit non-editorial changes to the spec. All work would then iterate in a separate repo, which we'd merge into the spec when it's ready to be merged. Separately, even if we go through the process, I'm worried that merging this change into the tip-of-tree of this repo does require going through the W3C working draft process. A way to avoid this is to work purely in a branch for each working draft. This change then wouldn't affect the draft. |
This was content that was present in the design repo material before it got converted to our current spec. It was lost in transcription as the old verbiage was very informal (and Dan had the impression it got moved to the HTML spec). We could certainly churn through the proposal process, but I would argue that's unnecessarily bureaucratic. |
Brad's explanation makes sense. Agreed. |
@domenic Oops re: Memory. Fixed up the cases I saw. @jfbastien I think there are two discussions here and we're talking past each other. One question is whether this is a new feature. I don't think anyone is saying no to this. This definitely isn't an editorial change, it's a normative change and new feature (which might deserve to cut to the head of the line for @flagxor 's historical reasons). I'd be fine asking the Wasm CG for a repo here, but sounds you're not asking me to. What @domenic is suggesting is that the WebAssembly CG may want to maintain two documents in parallel: A FWPD (what you've already sent out, leaving it frozen), and an "Editor's draft" (W3C speak for a document that you can just edit whenever you want). Basically, the standard that you put through the IP opt-out period and the other formal processes that follow it would be a fork of the editor's draft, which we'd occasionally backport editorial patches to or whatever else is urgent. A nice thing about maintaining an Editor's Draft is that people working on new features don't have to block on W3C process to get their work done. At the same time, people who are focusing on W3C process have a stamp-ready, feature-filled document ready for them once the cycle opens up again--there's no need to "open the floodgates" and merge a bunch of things at a particular time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but defer to folks that know the interop with the html spec better.
Correct, if this is just putting in something that was agree on, and that was dropped, then I don't think you need to go through the process. I encourage you to provide this type of context next time, so non-editorial changes are easy to categorize.
Agreed, it's simply more maintenance to keep two branches going. |
OK, thanks for the reviews everyone! I'll put this on the agenda for the next Wasm CG meeting so we can discuss next steps. |
It seems like web-platform-tests already has good tests for module serialization, which verify that
Two things from here:
|
So my recollection was that there was disagreement around requiring a secure context for IDB serialization (from JSC + Firefox)? Chrome has taken the position that persisting code from a potentially insecure context is problematic. I think the agent cluster check just fell out of the assumption that code sharing was most essential between threads. I'd be happy dropping it if others don't care? |
Given the past disagreement on this security question about storing code from insecure contexts, how should we proceed? Should the specification permit the various positions taken by different browsers, or should we come to a common, single semantic? |
The security question is specifically about caching compiled code in IDB from an insecure context, not serializing, correct? What do other web caching things do? |
We must become interoperable on this web-observable behavior. Other modern caching things (e.g. the service worker cache API) are not accessible from insecure contexts. |
@jfbastien "Serializing" is the name in the HTML specification that describes how things are stored in IndexedDB as well as postMessage. There is a forStorage mode switch which specifications can use to differentiate those two cases. |
If other modern developer-controlled caches are spec'd to disallow caching from insecure contexts then it seems like we should follow the same rationale unless their motivation to do so doesn't apply to WebAssembly caching. Specifically, what's the bad stuff that an attacker can do if they can get a module cached? Can they permanently inject content into a site? If they decide on the IDB key then the answer is "yes", but if the key is specified as something they can't control then "no". |
Right, it's about injecting content into the site. If I understand correctly the scenario, under insecure HTTP, an attacker can control everything, including the IDB key. |
As we discussed in the CG call today, I'd like our decision to mention HTTP / HTTPS to be a principled one, not just "hey we could do it for IDB". It sounds like at least Chrome wouldn't allow IDB serialization under HTTP. I'd rather spec any secure context restrictions separately, following a WebAssembly overall guidance. Maybe this can be an offshoot out the secure context discussion? |
We heard in the call that Chrome is unlikely to ship the ability to store Modules from insecure contexts in IndexedDB. If it's possible, I hope we can soon come to agreement among all the browsers that what we specify here will eventually be shipped. To ensure compatibility, I'd prefer to aim to be only adding features over time, and not shipping additional restrictions on previously shipped features unless it is to correct a previous mistake. (If there isn't compatibility risk from a future restriction, it's because the feature isn't being used; in that case, why bother specifying and shipping it?) I think additional restrictions are a decent conservative default given disagreement. In this case, I could see logic to the restriction--persistent storage of attacker code is worse than ephemerally running attacker code on a particular page load. |
I don't really see Chrome not shipping this for HTTP as a big issue since all sites have to handle IDB storage failure already, and not all browsers currently support IDB module storage. The incompatibility is indistinguishable from the current state of the web, as well as from random failure. If you want to aim for specifying HTTPS for module serialization, that's also fine with me. As I've outlined I think we should do so in a principled manner, not ad-hoc for this proposal. We can block serialization of modules on figuring out a precise policy for how WebAssembly should act under an insecure context. Should we discuss this with webappsec alongside CSP, or as a separate thing? |
If we're OK with unpredictable performance, I don't understand the motivation for this feature. Browsers could just transparently cache modules based on a hash of the bitcode. I thought the idea of serializing modules in IndexedDB was that you'd be able to fairly reliably control when they are cached and read out. There's a bit of difference between a permanent state of differences between browsers and other cases where this feature won't be available: when comparing to the current state of the web, this is hopefully a one-time upgrade, and then all browsers should support IDB persistence of modules reliably. For eviction from storage, this can be controlled by the persistent storage API. For random failures, these should be pretty rare and analogous to a first page load; it should be pretty rare for a second load immediately after that to face the same issue. |
@littledan I'm making a very narrow argument from the current state of which browser ships what. From that current state of things, there's little difference in a world where HTTP and HTTPS don't act the same, whether normative or not, because already different browsers don't act the same and already there's random eviction. Longer term (say, later this year) we'll figure out how to handle HTTPS in this spec. I just don't think an ad-hoc solution for this specific feature is right for the long term, because I'm not sure we'll get the big picture right (we might, I'm simply not confident that we will). Otherwise we'd already have that big picture in the spec. |
We discussed structured clone of modules in the April 4th WebAssembly CG call. To summarize the discussion:
I'll follow up with a patch which maintains the secure context restriction and follows the other restrictions listed above. |
Not cross-site, but they can go cross-origin. |
@annevk Are you talking about the postMessage case? Can you give an example/reference for more details about the cross-site/cross-origin differences and how it relates to agent clusters? We were having trouble in the last CG call. |
@littledan if example.com embeds site.example.com through a frame they'll be in the same similar-origin window agent. That's because if they both set Basically, an agent cluster by definition can consists of multiple origins, but those origins always have the same eTLD+1 (registrable domain). |
Thanks for the explanation. Seems like we should discuss in the CG whether we want a same or similar origin restriction here. |
@littledan that would require storing origin data on the |
@annevk What do you think about this current patch, which stores the agent cluster if it's not forStorage (similar to SharedArrayBuffer, but for different reasons). |
@littledan I think that's the correct approach. Though writing that down I do now recall that folks raised a concern in private with me about that approach combined with |
Apologies for my delay. Chrome is considering shipping same-agent-cluster postMessage but not IndexedDB serialization of WebAssembly modules. I've updated this PR to match that proposal. See 314099c for more information. |
The [=serialization steps=], given |value|, |serialized|, and |forStorage|, are: | ||
|
||
1. If |forStorage| is true, throw a | ||
"<a exception>DataCloneError</a>" {{DOMException}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "{{DataCloneError}}"
is more conventional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if there was a MessageChannel test where port2 was transferred cross-site. Then invoke port1's postMessage(). Then transfer port2 back same-site. Then invoke port1's start(). And some variations on that theme.
document/web-api/index.bs
Outdated
sharing. | ||
|
||
Note: The web platform does not permit cross-origin sharing of serialized Modules: | ||
Agent clusters in HTML do not span multiple origins. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. It does not allow sharing when two origins's hosts are not https://url.spec.whatwg.org/#host-same-site or do not have the same scheme. Otherwise sharing is possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is also why we call it the similar-origin window agent, not just window agent.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything linkable that I could reference for "similar origin", or is this just something that's used in the air but not documents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just removed the note.
…serialization This restriction was non-sensical because there is practical no way to serialize an agent cluster forStorage.
…odule deserialization" The same agent cluster restriction is present in the case of serializing within memory. Also add a note about how this restricts serialization to be within an origin.
Based on discussions in the WebAssembly CG, it sounds like IndexedDB serialization of WebAssembly Modules has the following problems: - On some engines, it's impractical to serialize more than just the binary module data - On some engines, upgrading from one version to another would cause the serialized module to be invalidated, with complicated implementation logic and a loss of reliability - Some engines plan to just return an error for IndexedDB module serialization as a design decision The purpose of IndexedDB Module serialization is for reliable performance, and these items provide risks to that. This patch proposes that, initially, IndexedDB serialization would not be provided for WebAssembly modules.
When do we want to consider landing this PR? Should it be in the v1 spec, or are we OK with waiting for v2? It's a significant new feature, so I'm not sure if it might reset some sort of clock. @lukewagner suggested that waiting for v2 might be fine, as most of the use cases for postMessage are motivated by threads, which are only coming in v2. @Ms2ger Would you be interested in looking into the tests which @annevk suggested? |
We discussed having this for v1 already, and only decided recently not to move forward with serialization for IndexedDB part. I think it makes sense to keep for v1.
It will, yes, but we're past the point where that matters. We need to put out a new version regardless which will necessarily reset the clock. |
OK, let's discuss this at the next CG meeting tomorrow and merge it if it seems OK |
We discussed this in the June 26th CG meeting and decided to include this PR in v1 of the spec. |
@Ms2ger Could you make sure we have good tests for this patch in web-platform-tests? In particular, we should have tests that serializing in IndexedDB throws, and that serialization in postMessage and MessageChannel works. |
Looks like web-platform-tests/wpt#12488 covers this neatly. |
This patchset adds support to WebAssembly.Module to be used in HTML serialization, including from
postMessage
and IndexedDB. The change is based on @flagxor 's earlier PR, with the following changes: