Attendees:
- Dan Ehrenberg (DE)
- Surma (SUR)
- Shu-yu Guo (SYG)
- Mark Miller (MM)
- Chip Morningstar (CM)
- Jack Works (JWK)
- Yulia Startsev (YSV)
DE: Surma has a list of topics.
SUR: Something that came up again recently is whether we want to cache modules by source position. The current spec draft does not cache by module position but there are examples where we might want to. We can still talk about whether we want a Module constructor like a Function constructor (currently doesn't have). Import.meta.url is another topic: what does import.meta.url look like from inside a module block? Currently this value uniquely identifies a module and module blocks may or may not break this property. Object URLs (?) should not be extended, Anne VK expressed opinion that each API that works with module blocks should just work with module blocks. Finally, we can talk about whether we should make the Worker constructor accept module blocks. There are concerns of creating too many workers, thus creating a performance footgun.
MM: I'm surprised that the very central concern is the convergence with the static module records from the Compartments proposal. Static module records have a constructor that lets you construct it from a module source code. It precisely is a reified linkable static module that you can then play into the parameterization for the import namespace. To me that's the central exciting thing about module blocks is how it's convergent with the independently arrived at static module records.
DE: We can start with MM's topic and go down the list of SUR's topics. On MM's point I agree these are convergent. My current understanding of both proposals is there's no real changes need for both proposals to line up. Even if we make the module block constructor throw today there's nothing preventing us from adding it later that accepts string. Similarly for introspecting the namespace, can be layered on later.
MM: I agree on the namespace. I don't quite understand the constructor point. Last I looked there's no constructor, only syntax for module blocks. I'm not suggesting we converge in one step. I think I'm agreeing with you that we let module blocks go forward by itself, and then elaborations can come with the Compartments proposal. But that we do not paint ourselves into a corner and my design constraint is that we can converge happily.
DE: When we were writing it out we were talking about what's the prototype of these? What's the constructor? The constructor was immediately controversial because now there's another place where you can compile strings. People felt strongly both ways. My view is not initially supported but supported later: initially throw.
MM: I think that's reasonable. Assuming there's a version of the spec that goes out with module blocks in it but not yet compartments, there would need to be strong advice (not sure how normative) not to count on the throwing behavior because we expect that to be something else.
DE: We can probably have a note.
SUR: Throwing would need to be handled in any codebase anyway because of syntax errors. I don't think we're causing compat issues for code that rely on throwing.
MM: For example, Function ctor has been used in places as a syntax check for a function body.
SYG: I'm not too worried. If it always throws, I can't see what reasonable dependencies would form. Additionally web platform hasn't considered throwing -> not throwing as a compat concern in principle, so hopefully the rest of the ecosystem got the message.
MM: There're all these concerns around CSP. Once you have a module block object, the ability to link and instantiate shouldn't be constrained by constraints on evaluation like CSP, which I imagine it's mostly about going from string inputs. Analogously to Function constructor, we don't constrain calling the function object, just whether the reification of the constructor succeeds.
DE: Module blocks can import other modules so that can lead to execution of other code.
MM: The complexity you just raised is a great example for why this is a good issue to postpone. The issue of what it means to meaningfully suppress evaluation in a context where imports through loaders can cause other code to be loaded is not a trivial issue. Another one that's mentioned is import.meta. I don't have a clear sense as to how this plays into the Compartment import.meta hook and at what point it gets bound. Does this get bound in the module block expression it was in? Is it instead the Compartment context that links and instantiates the module block?
DE: You're raising a series of questions that are in SUR's agenda. Let's go through that list in order. Let's start with import.meta. In spec draft there's nothing stated because import.meta is in the HTML spec. The issue with import.meta.url is there're two possibilities. 1) Lexically enclosing module's import.meta.url. If you have a module block it's always inside another module somewhere and we can reuse that one. 2) Or we can reuse that one and add an extra fragment. Whatever we do it's important to preserve the ability to create new URLs relative to the source location, which is currently done via import.meta.url.
MM: It's a strange place to start since ecma262 doesn't have an import.meta.url, only an import.meta. What's the proposal for the rest of import.meta?
DE: You're right this is not a TC39 thing. We'd create this proposal in HTML. HTML only has import.meta.url right now. We wouldn't add any extra properties.
SUR: There is import.meta.resolve as a side note: it resolves paths. More interestingly there's some code in wild that relies on import(import.meta.url)
for reliably importing yourself. Probably something we should try to uphold for module blocks, so 1) above might not be insufficient.
DE: A fragment is insufficient too, since it'd also just import another copy of the parent module.
MM: There's another space where source location comes up in ecma262: in Error tracebacks, but this is only de facto. Jordan and I have a non-specific proposal that leaves definition of the traces up to the host. Clearly there's established practice about source locations. There're these TC39-adjacent places where source locations do come up in the language.
CM: I'm not sure what the issue is. Who cares about the source location?
SUR: In web dev if you fetch any resource or use any form of a path, by default they're relative to the base url. The base url is where the HTML file is. The pattern that's emerged is to rely on import.meta.url to make URLs relative to the module you're currently in. The other pattern is importing yourself. Not sure what the use case is but I've seen it done. Primary use case is secondary resources, like images. For those use cases we need to make import.meta.url have a sensible value. The concern is if we copy the parent module's URL then module blocks' URLs wouldn't be uniquely identifying.
DE: I think the importing self use case already isn't true. I think we should use the URL of the surrounding JS module.
SUR: With or without a fragment?
DE: I wouldn't. We don't make up fragments in general.
JWK: So only HTML chooses to add it?
DE: This would be part of HTML standard, not JS standard.
CM: Without a fragment, if you have two different module blocks in a single block of HTML wouldn't they have the same URL?
DE: Yep.
CM: So what's the URL telling you?
DE: It's telling you where the outer resource it came from. Same thing happens if an HTML file has two inline script type=module tags inline. Tells you the URL of the file that contained it.
CM: So this is not "where am I" but "who summoned me"?
DE: It's not who imported you, it's where the source text comes from.
CM: If you have two different fragments of source text that came from the same place, you can't distinguish them from each other and that seems weird. I'm mostly expressing these opinions out of confusion. I have no dog in this fight.
SUR: That's exactly the concern: you can't distinguish. Is this an actual problem in practice?
JWK: I think maybe we can do some research. Let's search in-the-wild use of import.meta.url and see what people do to the URL.
SUR: That'd be great if we could find examples.
DE: Let's follow up in GH. I like Jack's suggestion for research.
DE: Let's move on to the caching topic. Currently, each time you evaluate a module {}, you get a new copy of the module block. If you import it different module blocks you get new entries in the module map. But should it be cached by source text location? V8 and SM engineers have said in the past caching, like for templates, is a pain and would like to avoid that.
SUR: Here're some examples where caching by source text might be beneficial.
onclick = () => runInWorker(module { ... });
// React hooks:
const [...] = useWorker(module { /* ... */ });
// Terminology:
const arr = [];
for(let i = 0; i < 10; i++)
arr[i] = module { /* ... */ }; // <- *Static* Module
await import(arr[0]) == await import(arr[1]); // <- Module *Instances*
// From Shu:
runInWorker(() => {})
// vs
let h = () => {}
runInWorker(h)
// analogous:
const m = module { ... };
onclick = () => runInWorker(m);
// Let’s pretend there’s an actor model library
for(let i = 0; i < 10; i++) {
spawnActor(module {
});
}
Suppose we have onclick = () => runInWorker(module {})
. If each module {}
evaluates to a different one, each click makes a new module, and over time you'd accumulate many many modules. This could be solved by caching source location. But more I think about it I think caching is unexpected for devs. Analogously for closures and object literals, they're not cached, and people know to not do this on hot paths. Going further, if we had an actor library: if where the actor runs is abstracted away, if you deduplicated by source text you can end up with two actors on the same thread with the same module, which might be unexpected. But the point stands that writing runInWorker(module {})
is just so ergonomic to write. After talking with folks I'm convinced the current non-caching behavior is correct, and there's some outreach to be done. I think there's also an unknown here: I don't actually know how much memory a module instance consumes.
JWK: I think we should not create a new module instance because I thought the original motivation was to make it statically declare a new module. In the for-loop example, there's no need to have a fresh module every time so cache by source location seems good to me.
MM: Let's make sure we're talking about the same thing. We say "instance", there's potential for confusion. A module instance is a linked and initialized thing with an exports namespace object. Over here we're talking about the static concept. But the static concept is still an object in the language. That object is also an instance. I'm gonna try to call these static modules.
There is a subtle meaning to taking same source text and creating multiple source text over just reusing it. In a given import namespace, let's take dynamic import(). Right now it takes a specifier string and gives you a module instance. We're talking about enabling you to give it a static module and get back a module instance where the instance is linked and instantiated in the context of the import namespace. The constraint is within any one import namespace for any one specifier, there's only at most one module instance. I'd expect the same as static modules. But that's not a constraint on source text: the same source text on the file system can be linked to multiple specifiers, and can end up having multiple instances.
DE: Right, the question if we want multiple instances or not? It all fits within the scheme MM gives. I'm now concerned about evaluating giving fresh copies, because these are all held alive by the module map. So it feels a bigger granularity than function closures and object literals.
SUR: The concern is that it might be a performance footgun and I'm not sure what the cost here is. Syntactically I'm ignoring the footgun and I'm in favor of treating it like an object literal but I do hear the concern. Question is it something we need to railguard against?
SYG: What is the thing that's kept in the module map? The static module is not in the module map until it's instantiated, right?
DE: Yeah, but the idea is that you pass this to functions to be imported.
SUR: Yes.
SYG: I think that, if the use pattern comes to pass that--because they're not cached by source text, you would have to do some weak pointer caching, otherwise the onclick example would accumulate infinitely. If there's no actual way to collect it, that's a bigger issue.
DE: How bad is it to cache by source location?
SYG: We have to avoid the immortality of module blocks analogous to template strings.
MM: What is the worry?
SYG: Even if each module isn't so expensive, it will get imported repeatedly, creating more and more entries in the module map. Currently, the module map doesn't hold anything weakly.
DE: I thought we fixed the immortality issue by caching by source location rather than the contents.
MM: The meaning of a module expression has zero to do with the lexical context.
JW: Caching is not the problem, but rather creating new worker and new module graph each time you click the button. But, if you instantiate a new module graph every time, the only code that's executed is the code in the module block, because the dependencies in the module expression is already instantiated in the before-times, so you can reuse the older results. The dependencies won't be re-evaluated again. So, I think it might not too expensive.
DE: There's this react hooks case, as well, where you're not really trying to run anything new.
JW: In the hooks case, you're not capturing anything, so--never mind.
SYG: People's expectations now, if you could pass a function to a worker, would create a new closure each time. So, you may already want to lift it out, to avoid capturing things multiple times. So, why don't we think that people will lift this out, anyway? We could let the user solve the caching problem for us.
MM: A place where we've seen this come up is classes, where each time a class expression comes up, it has its own constructor with its own instances. What if we restricted module {} to be top-level only?
DE: I'm very skeptical of that. JS code gets put into all kinds of contexts. You can see this in the explainer. We find that to deploy code you need to allow it to be embedded in all these different contexts. I think flexibility is really necessary.
SUR: Agree. Would be a big price to pay if we don't know if it's a big problem yet. I see the concern but not yet convinced it's a serious risk.
KK: From the Compartments proposal's perspective. My preference would be to punt the issue of caching to the user. Because it would constrain our choice for cache keys for the Compartments proposal in particular. Currently Compartments use the result specifier of the static modules. It would be unfortunate to couple that cache key to source location at this stage.
DE: I don't think this would couple those. This would be more like the module block syntax is cached but cache key in general isn't the source location.
MM: If we're using the reified static module block object where we would use a specifier, then it's exactly as unique as a resolved specifier is.
KK: Concretely, you're proposing that the identity of the static module instance would effectively serve as the cache key. Is that right? (Yes) It could be made to work, but it's more complicated for someone implementing a Compartment, but that's not really a strong concern. As a JS developer I'd expect to be able to choose whether I want a module block to execute more than once based on where I declared the module.
SUR: That kinda aligns where what I was thinking. There are use cases where you want 10 different static modules in a loop even if they have the same contents, like where I was thinking for the actor model. If we were to cache by source then we wouldn't be able to do that.
JWK: Is this possible to add a modifier like "unique" that evaluates a unique module each time?
SUR: If we worry about the performance footgun we can explore the default of caching and let you opt out with a specifier.
DE: This has been a good discussion, I want to get to Worker constructor before closing. What do folks about using Worker constructors taking module blocks directly instead of via a data url?
JWK: It seems very useful but it seems like HTML doesn't like it.
DE: One person in HTML said we should do it, one person in HTML said we shouldn't. I'm wondering about your opinions.
JWK: I like the idea. That's the basic motivation of this proposal and would simplify creation of workers.
MM: This came up with Realms as well with using module blocks to send code to populate the new Realm. Seems attractive and it's nice for those to be parallel with Workers.
JWK: Creating a new Realm will have low performance cost, but Workers have high performance cost.
MM: True.
JWK: Is there an issue about the caching?
SUR: No issue yet.
DE: If you'd like to file an issue, that'd be useful.