-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
module: ESM loaders next steps #36396
Comments
Combining load and transform into a single hook makes me very uncomfortable. It seems like this will result in transform hooks being silently skipped by load hooks if chained in the wrong order. |
Can you elaborate on this concern? It would be possible to have transform as a 2nd pass but it would create two competing ways to write a compiling loader which seems confusing. It would also mean that it's really hard to write a loader that can be 100% certain that the code they generate for a (potentially virtual/in-memory) module definitely runs without modifications. Because even if its load hook runs first, some other loader's transform hook may mess with the generated code. And it would require a careful dance to undo that with a transform hook. |
The example https loader has a code branch which does not call As far as having two ways to transform a module technically yes but IMO it would be incorrect to implement a transform with the load hook (a bug in that loader).
I might be misunderstanding on this point, I would expect all |
I think this might be a source of confusion. In that example, the HTTPS loader’s So if this were the only loader in use, e.g. When the CoffeeSript loader is added to the chain, its
Every loader hook has to return the expected value, whether a string URL for So And because the calls are explicit—the chaining happens via calls to |
My specific concern is what happens when someone (or something) runs:
This would cause import of This becomes more difficult to defend against when you consider
|
Should we convert this issue to a discussion? |
We could, but this is also an issue in that it has TODO items that pull requests should address (eventually to close the issue). I guess we could create a separate discussion related to this issue? My understanding (correct me if I’m wrong) is that once this becomes a discussion it can’t be converted back into an issue. |
If you were to refactor the above Keep in mind that the reason we’re moving the |
Could we allow Also sorry for my slow response, offline stuff has taken over most of my time lately. |
My concern with this is that now we have two patterns of chaining, with the Also are we running all the Basically, there are a lot of questions to be worked out when designing the API for how a separate |
Would it be better for node.js to throw a TypeError if a transform gave an undefined return, reference the URL of the offending hook in the message? Honestly how node.js handles undefined return isn't hugely important to me as long as it's documented. I'll never return undefined from a transform and someone else returning undefined from a transform will not cause my hook to be skipped silently. import CoffeeScript from 'coffeescript';
// CoffeeScript files end in .coffee, .litcoffee or .coffee.md
const extensionsRegex = /\.coffee$|\.litcoffee$|\.coffee\.md$/;
export function transform(previousResult, context) {
// The first check is technically not needed but ensures that
// we don’t try to compile things that already _are_ compiled.
if (previousResult.format === undefined && extensionsRegex.test(context.url)) {
// For simplicity, all CoffeeScript URLs are ES modules.
const format = 'module';
const source = CoffeeScript.compile(previousResult.source, { bare: true });
return {format, source};
}
// no action so return `previousResult` which came from
// the `load` chain / previous `transform` functions.
return previousResult;
}
I think it is. For someone writing a transform hook having that hook skipped is a bug. If I write a transform hook that gets silently skipped because of another hook that the end-user might not even be aware of this would make support more difficult. Keep in mind
Yes, the
I'm not strongly against this idea but I worry about loader hooks becoming stable without a separate transform hook. |
Okay, my current state of thinking it through: I think it's possible to have a separate
Adding
Type 1 hooks would have to added first. Type 2 and 3 hooks can be added interleaved but would have to be ordered within their categories. A JS-to-JS hook could be added before or after a "additional resource" hook but would have to be listed after any code instrumentation hooks and before all non-js-to-js hooks. In other words: Code instrumentation could still not be blindly added as the first hook, it would have to be added after any and all type 1 hooks. One consequence of splitting the precedence into these 3 groups is that type 1 hooks would be harder to write: In a system with just one pass, they could just return the exact source they require from
In other words: Writing a loader that ignores coverage instrumentation is still possible, just more awkward. And the sorting requirements are a little relaxed but really just for a single case: A hook for additional protocols or resource loading behaviors has a little more flexibility in where in the chain it's specified. |
These last few posts have made me more certain that we should complete the already-proposed work first before potentially adding a It seems to me that The appeal of |
This comment has been minimized.
This comment has been minimized.
I think so, yes. All the hooks in the current API are async, so presumably the future |
I had this report which the user is trying to use two experimental-loaders at once. It appears from the code that chaining doesn't happen from one loader to the next, and that only one or the other will end up getting used, depending on the order specified. https://github.com/d3x0r/JSON6/blob/master/lib/import.mjs |
Chaining has not yet been implemented. That’s what this issue is about. |
Hiya, I'm "the user" mentioned by d3x0r above. I hope you don't mind me chiming in with my 2¢: The current experimental implementation is pleasantly easy to understand and use. The "other" loader d3x0r mentioned facilitates import path aliases (ex I quite like @GeoffreyBooth's proposal to consolidate the hooks into the 2. I think it makes sense to consolidate them like that, but I do have a couple points: Simple things should (continue to) be easy. If we merely want to do a small subset of what the whole loader can do, that should be easily accomplished with minimal code. For instance, if a user has some module-like file whose extension for whatever reason isn't Requiring
I think a separate RE the sequence for |
@jshado1 this is a great proposal! Is this something you might be interested in collaborating on? |
@GeoffreyBooth yes! |
It would be great to have your input further then. Note one difference with a |
I would say in that situation the answer is merely: don't. I think |
So if you have a loader chain with a loader in it using |
I think either, expose that a different way or allow the subsequent ones to run but not change the end result (but probably not, because that could be confusing). I think injecting some monitor into the chain is not the way to go. I would expect that to be some kind of "finally" hook (that receives a dump of everything that happened) to ensure it happens after all is said and done. |
You should follow jestjs/jest#9430, that might be more directly related to your use case. The idea on Node’s side is to get ESM loaders and |
Ad hoc meeting to discuss custom loader hooksHello everyone, @JakobJingleheimer and I are interested in scheduling a call to discuss the hooks API in greater detail. There are many interested parties, and several attempts at changing the current experimental custom loader hooks API have failed for various reasons, so we want to make sure that everyone's use cases are heard out and addressed. If you have a use case that has not yet been voiced (or you simply want to ensure that it is accounted for), we will value your input on this call. We are hoping to be able to come to a conclusion to move #37468 forwards. Note: This call is mainly going to focus on the public API of the hooks system, so we'd like to know what hooks y'all need. When: Tuesday, April 6 @ 3:00 EST /cc everyone in this thread, @nodejs/modules, @bengl, et al. Please 👍 if you are interested in attending and I will send you a link that day |
We should be getting started in about 10 minutes. The link is below. Hope to see you all there! https://meet.google.com/jod-busy-frb @bengl @JakobJingleheimer @zackschuster @songkeys @GeoffreyBooth @d3x0r @giltayar @Qard @Flarna The meeting ran for about an hour and a half with additional dialogue scheduled for next week. |
For those interested in the continued dialog, today's meeting has been rescheduled for next week at the same time. We would like to have our conclusions from the last meeting reified in #37468, which is currently making good progress. |
Oh, missed it again...some proper calendar invites would help. 🤔 |
No problemo @Qard, hopefully a Google Calendar invitation will help everyone with adding it to their calendars. 📅 Event: Node.js Module Loaders Working Group Meeting 2021-04-23 If there is some other calendar invitation format you had in mind, please let me know. |
|
Hm, here is the whole calendar then. I've made it public, so it should work. |
Hey folks. I’ve been working on setting us up as a proper team akin to the Modules team. We now have a repo: http://github.com/nodejs/loaders. There’s also a group: @nodejs/loaders. Please respond with 🚀 if you’d like me to add you to this group, and by extension to the new Loaders team. You’ll start receiving GitHub notifications whenever that group tag is used. As for meetings, the way those work is that we need to use the Node Zoom account which records and streams to YouTube. I’ve asked for access to this, which won’t come until next week at the earliest. So I think we should probably postpone this week’s meeting (sorry). Also, no two Node teams’ meetings can overlap, because the Zoom/YouTube accounts can only stream one meeting at a time. The Friday 15:00 UTC meeting time overlaps with another team’s meeting, so we need to find a new regular time. Here’s the calendar of Node team meetings. There’s a gap two hours later, at 17:00 UTC / 1 pm ET / 10 am PT; I was thinking we could do this time every two weeks, starting on Friday April 30. If this works for you, please reply with 👍 , otherwise 👎 . If you can’t attend and you’d like to, please login to node-js.slack.com and discuss on the Thanks! |
Hey @nodejs/loaders, everyone who 🚀 ‘ed the previous message should be in @nodejs/loaders now. Please also take a look at https://github.com/nodejs/loaders. I’m still waiting on the Zoom credentials, so I guess we can’t meet tomorrow. I’ll propose a new time once I have the credentials, probably at the same time either a week or two weeks from tomorrow. Following the pattern of other Node teams, the meetings will be announced via issues in the team repo; you should be notified via the @nodejs/loaders tag and if you watch the repo. Issues and PRs on any |
@GeoffreyBooth if the issue is recording the meeting, could we use a google Meet or does it have to be zoom? I have an early adopter account/organisation for gSuite, so I get most of the paid features for free; I remember Meet switched meeting recording to a premium feature a while back—I think I still get it (I can check). Update: I can't find "recording" in settings (it must be enabled now apparently), so I'm not sure. "G Suite Legacy" is never listed in any of the support documentation, so I dunno if I'm supposed to have it (best I can find is a list of general features for G Suite Legacy). |
Hey @nodejs/loaders, I should have everything we need for our next meeting, which will be at 17:00 UTC / 1 pm ET / 10 am PT on May 14 (not tomorrow). nodejs/loaders#1 was a test of the meeting infrastructure, that the scripts that look at the Google calendar would generate the meeting agenda and so on properly, and everything appears to be working. Sorry for all the delays and confusion, and I look forward to seeing all of you next week! |
Closing this issue as discussion has moved to https://github.com/nodejs/loaders |
This issue is meant to be a tracking issue for where we as a team think we want ES module loaders to go. I’ll start it off by writing what I think the next steps are, and based on feedback in comments I’ll revise this top post accordingly.
I think the first priority is to finish the WIP PR that @jkrems started to slim down the main four loader hooks (
resolve
,getFormat
,getSource
,transformSource
) into two (resolveToURL
andloadFromURL
, or should they be calledresolve
andload
?). This would solve the issue discussed in #34144 / #34753.Next I’d like to add support for chained loaders. There was already a PR opened to achieve this, but as far as I can tell that PR doesn’t actually implement chaining as I understand it; it allows the
transformSource
hook to be chained but not the other hooks, if I understand it correctly, and therefore doesn’t really solve the user request.A while back I had a conversation with @jkrems to hash out a design for what we thought a chained loaders API should look like. Starting from a base where we assume #35524 has been merged in and therefore the only hooks are
resolve
andload
andgetGlobalPreloadCode
(which probably should be renamed to justglobalPreloadCode
, as there are no longer any other hooks namedget*
), we were thinking of changing the last argument of each hook fromdefault<hookName>
tonext
, wherenext
is the next registered function for that hook. Then we hashed out some examples for how each of the two primary hooks,resolve
andload
, would chain.Chaining
resolve
hooksSo for example say you had a chain of three loaders,
unpkg
,http-to-https
,cache-buster
:unpkg
loader resolves a specifierfoo
to an urlhttp://unpkg.com/foo
.http-to-https
loader rewrites that url tohttps://unpkg.com/foo
.cache-buster
that takes the url and adds a timestamp to the end, so likehttps://unpkg.com/foo?ts=1234567890
.These could be implemented as follows:
unpkg
loaderhttp-to-https
loadercache-buster
loaderThese chain “backwards” in the same way that function calls do, along the lines of
cacheBusterResolve(httpToHttpsResolve(unpkgResolve(nodeResolve(...))))
(though in this particular example, the position ofcache-buster
andhttp-to-https
can be swapped without affecting the result). The point though is that the hook functions nest: each one always just returns a string, like Node’sresolve
, and the chaining happens as a result of callingnext
; and if a hook doesn’t callnext
, the chain short-circuits. I’m not sure if it’s preferable for the API to benode --loader unpkg --loader http-to-https --loader cache-buster
or the reverse, but it would be easy to flip that if we get feedback that one way is more intuitive than the other.Chaining
load
hooksChaining
load
hooks would be similar toresolve
hooks, though slightly more complicated in that instead of returning a single string, eachload
hook returns an object{ format, source }
wheresource
is the loaded module’s source code/contents andformat
is the name of one of Node’s ESM loader’s “translators”:commonjs
,module
,builtin
(a Node internal module likefs
),json
(with--experimental-json-modules
) orwasm
(with--experimental-wasm-modules
).Currently, Node’s internal ESM loader throws an error on unknown file types:
import('file.javascript')
throws, even if the contents of that file are perfectly acceptable JavaScript. This error happens during Node’s internalresolve
when it encounters a file extension it doesn’t recognize; hence the current CoffeeScript loader example has lots of code to tell Node to allow CoffeeScript file extensions. We should move this validation check to be after the format is determined, which is one of the return values ofload
; so basically, it’s onload
to return aformat
that Node recognizes. Node’s internalload
doesn’t know to resolve a URL ending in.coffee
tomodule
, so Node would continue to error like it does now; but the CoffeeScript loader under this new design no longer needs to hook intoresolve
at all, since it can determine the format of CoffeeScript files withinload
. In code:coffeescript
loaderAnd the other example loader in the docs, to allow
import
ofhttps://
URLs, would similarly only need aload
hook:https
loaderIf these two loaders are used together, where the
coffeescript
loader’snext
is thehttps
loader’s hook andhttps
loader’snext
is Node’s native hook, so likecoffeeScriptLoad(httpsLoad(nodeLoad(...)))
, then for a URL likehttps://example.com/module.coffee
:https
loader would load the source over the network, but returnformat: undefined
, assuming the server supplied a correctContent-Type
header likeapplication/vnd.coffeescript
which ourhttps
loader doesn’t recognize.coffeescript
loader would get that{ source, format: undefined }
early on from its call tonext
, and setformat: 'module'
based on the.coffee
at the end of the URL. It would also transpile the source into JavaScript. It then returns{ format: 'module', source }
wheresource
is runnable JavaScript rather than the original CoffeeScript.Chaining
globalPreloadCode
hooksFor now, I think that this wouldn’t be chained the way
resolve
andload
would be. This hook would just be called sequentially for each registered loader, in the same order as the loaders themselves are registered. If this is insufficient, for example for instrumentation use cases, we can discuss and potentially change this to follow the chaining style ofload
.Next Steps
Based on the above, here are the next few PRs as I see them:
resolve
,load
andglobalPreloadCode
.resolve
andload
. Node’s internal loader already has no-ops fortransformSource
andgetGlobalPreloadCode
, so all this really entails is merging the internalgetFormat
andgetSource
into one functionload
.resolve
(on detection of unknown extensions) to withinload
(if the resolved extension has no defined translator).default<hookName>
becomesnext
and references the next registered hook in the chain.load
return value offormat: 'commonjs'
to work, or at least error informatively. See esm: Modify ESM Experimental Loader Hooks #34753 (comment).transform
hook (see below).This work should complete many of the major outstanding ES module feature requests, such as supporting transpilers, mocks and instrumentation. If there are other significant user stories that still wouldn’t be possible with the loaders design as described here, please let me know. cc @nodejs/modules
The text was updated successfully, but these errors were encountered: