module: support ESM detection in the CJS loader #52047

Merged

Conversation

joyeecheung
Member

@joyeecheung joyeecheung commented Mar 11, 2024

module: support ESM detection in the CJS loader

This patch:

  1. Adds ESM syntax detection to compileFunctionForCJSLoader()
    for --experimental-detect-module and allows it to emit the
    warning about how to load ESM when it is used to parse ESM as
    CJS but detection is not enabled.
  2. Moves the ESM detection of --experimental-detect-module for
    the entrypoint from executeUserEntryPoint() into
    Module.prototype._compile() and handles it directly in the
    CJS loader so that the errors thrown during compilation and
    execution during the loading of the entrypoint do not
    need to be bubbled all the way up. If the entrypoint doesn't
    parse as CJS, and detection is enabled, the CJS loader will
    re-load the entrypoint as ESM on the spot asynchronously using
    runEntryPointWithESMLoader() and cascadedLoader.import(). This
    is fine for the entrypoint because unlike require(ESM) we don't
    need the namespace of the entrypoint synchronously, and can just
    ignore the returned value. In this case process.mainModule is
    reset to undefined as it is not available for ESM entrypoints.
  3. Supports --experimental-detect-module for require(esm).
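
For readers unfamiliar with the approach, the "compile as CJS first, fall back to ESM on ESM-only syntax" idea can be sketched in userland with the public `vm.compileFunction()` API. This is only an illustration, not the code in this patch (the real detection lives in C++ behind compileFunctionForCJSLoader()). The names `detectFormat`, `CJS_PARAMS` and `ESM_SYNTAX_ERROR_MESSAGES` are hypothetical, and the message list is an assumption about which V8 syntax errors signal ESM syntax.

```javascript
'use strict';
const vm = require('node:vm');

// Assumption: V8 reports ESM-only syntax in a non-module parse with these messages.
const ESM_SYNTAX_ERROR_MESSAGES = new Set([
  'Cannot use import statement outside a module',
  "Unexpected token 'export'",
  "Cannot use 'import.meta' outside a module",
]);

// Parameters of the CommonJS module wrapper function.
const CJS_PARAMS = ['exports', 'require', 'module', '__filename', '__dirname'];

// Hypothetical helper: returns 'commonjs' if the source compiles as a CJS
// function body, 'module' if compilation fails because of ESM-only syntax,
// and rethrows any other error (for example an ordinary syntax error).
function detectFormat(source, filename = 'entry.js') {
  try {
    // Compile only; the returned function is never called here.
    vm.compileFunction(source, CJS_PARAMS, { filename });
    return 'commonjs';
  } catch (err) {
    const isEsmSyntaxError =
      err?.name === 'SyntaxError' && ESM_SYNTAX_ERROR_MESSAGES.has(err.message);
    if (isEsmSyntaxError) {
      return 'module';
    }
    throw err;
  }
}
```

In this sketch a caller would hand the file to the ESM loader when `detectFormat()` returns `'module'`. Note that dynamic `import()` is valid in CommonJS, so a file using only `import()` is still reported as `'commonjs'`.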

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/loaders
  • @nodejs/startup
  • @nodejs/vm

@joyeecheung joyeecheung marked this pull request as draft March 11, 2024 19:45
@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Mar 11, 2024
@GeoffreyBooth
Member

GeoffreyBooth commented Mar 11, 2024

My note from #51977 (comment) still applies; I’m not sure why we need a separate PR for this. If we support detection at all, it should be via one flag, not two; having separate flags for detection for import and detection for require is a footgun for users because they should never be enabling one without the other.

@joyeecheung
Member Author

joyeecheung commented Mar 13, 2024

Pasting my reply from #51977 (comment) :

I don't want to couple require(esm) with --experimental-detect-module because I think the implementation of the latter has a lot of issues that we talked about before (reading and parsing the file twice etc.) and it seems unfortunate to me that once you enable require(esm) detection you also get yourself into those troubles. But to fix --experimental-detect-module there is a lot to be done, and it would unnecessarily complicate require(esm), so we might as well just keep detection out of the scope of this PR. I don't think it needs to cover all these to land as a first iteration experiment. Without detection this can already load all the ~20 high impact esm-only npm packages that I've tested just fine. Detection only serves a less painful point that can already be addressed with an explicit module type, while there's no workaround in core for require(esm) at all ATM. IMO the latter is the major pain point of ESM adoption while the former is minor and less of a priority, so I just want to focus on the higher priority.

I don't think require(esm) must support every feature there is (experimental ones, even) to land as an experiment itself. That would just procrastinate the entire thing. My criteria for whether the first iteration is feature complete enough is very simple - can require(esm) load say the top 20 high-impact ESM-only npm packages? If yes, then it's good enough as an MVP. We can sort out other stuff while it's experimental. If we want to enforce that experimental features must work with other features to land, then either SEA shouldn't land because it doesn't support import, or --experimental-detect-module shouldn't land because it doesn't support SEA, for example.

@GeoffreyBooth
Member

I don’t think require(esm) must support every feature there is (experimental ones, even) to land as an experiment itself.

That’s fine; as I wrote over there, I don’t think the initial require(esm) needs to support detection in order to land. But this PR is just adding detection to require(esm), so I think it’s fair to say that this PR needs to support detection in a way that we agree on.

I also think detection can be part of the first PR, as it’s simple and all I requested was that you change the new flag you’re adding to instead use the existing flag. But if you’d prefer to do two PRs that’s fine too.

@joyeecheung
Member Author

joyeecheung commented Mar 14, 2024

I think it’s fair to say that this PR needs to support detection in a way that we agree on.

I agree, but I think this would need a lot more than just what this PR already has in order to fix --experimental-detect-module. That's why I opened a separate PR to avoid derailing #51977. I personally don't think ContainsModuleSyntax should be implemented the way it is and we've talked about it (i.e. we should rely on V8 to cache successful compilations, reparse it as ESM on syntax failure potentially using V8 API instead of error message matching, etc.). But I would not expect to just fix that with a few lines, and it feels really out of scope of require(esm). For now I just want to prioritize require(esm) and set aside other things in the ESM loader that I think look off.

@GeoffreyBooth
Member

GeoffreyBooth commented Mar 14, 2024

I personally don’t think ContainsModuleSyntax should be implemented the way it is

Sure, but isn’t that a concern for unflagging it? Like those are just performance optimizations; you could add CommonJS detection to it and keep all detection behind the one flag, and then unflag everything once detection is refactored to your liking.

@joyeecheung
Member Author

joyeecheung commented Mar 14, 2024

Sure, but isn’t that a concern for unflagging it? Like those are just performance optimizations; you could add CommonJS detection to it and keep all detection behind the one flag, and then unflag everything once detection is refactored to your liking.

I think the detection in CJS should be done where CJS compiles code - that's where it can be done most efficiently. If done that way, the refactoring isn't really a separate thing and can cause constant conflicts.

@joyeecheung
Member Author

This needs #52413 to detect the ESM syntax properly in C++

@avivkeller avivkeller added module Issues and PRs related to the module subsystem. cli Issues and PRs related to the Node.js command line interface. labels Apr 21, 2024
@joyeecheung joyeecheung force-pushed the require-module-with-detection branch from 289a1cd to e559c26 Compare April 22, 2024 17:58
@joyeecheung joyeecheung changed the title [WIP] module: implement --experimental-require-module-with-detection module: support ESM detection in the CJS loader Apr 22, 2024
@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 22, 2024
@joyeecheung joyeecheung marked this pull request as ready for review April 22, 2024 17:58
3. If X.json is a file, load X.json to a JavaScript Object. STOP
4. If X.node is a file, load X.node as binary addon. STOP
5. If X.mjs is a file, and `--experimental-require-module` is enabled,
Member Author

I noticed that this is actually unnecessary: when X is already a file, it's covered by 1 (and even before we supported require(esm), 1 already implied that X.mjs can be loaded as an ECMAScript module by require(), kind of funny...).

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 22, 2024
@joyeecheung joyeecheung force-pushed the require-module-with-detection branch from e559c26 to dac7020 Compare April 22, 2024 18:05
@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 22, 2024
.eslintignore Outdated Show resolved Hide resolved
doc/api/modules.md Outdated Show resolved Hide resolved
src/node_contextify.cc Show resolved Hide resolved
src/node_contextify.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 22, 2024
@GeoffreyBooth GeoffreyBooth added the esm Issues and PRs related to the ECMAScript Modules implementation. label Apr 22, 2024
Contributor

@aduh95 aduh95 left a comment

The GHA CI is failing though

Comment on lines +292 to +293
assert(!isMain);
CJSModule._load(filename, null, isMain);
Contributor

If isMain is guaranteed to be falsy, is there any reason to pass it to CJSModule._load?

Member Author

Is there a reason not to pass it explicitly and rely on the implicit default value of CJSModule._load even though the value is readily available?

@richardlau
Member

https://github.com/nodejs/node/actions/runs/8850127576/job/24303670039?pr=52047#step:5:4444

../src/node_contextify.cc: In function ‘void node::contextify::CompileFunctionForCJSLoader(const v8::FunctionCallbackInfo<v8::Value>&)’:
../src/node_contextify.cc:1510:8: error: unused variable ‘is_main’ [-Werror=unused-variable]
 1510 |   bool is_main = args[2].As<Boolean>()->Value();
      |        ^~~~~~~

@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 29, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 29, 2024
Contributor

Failed to start CI
   ⚠  Something was pushed to the Pull Request branch since the last approving review.
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/8876920917

@github-actions github-actions bot added the request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. label Apr 29, 2024
@joyeecheung
Member Author

I keep forgetting that the request-ci label is unusable now :(

@GeoffreyBooth GeoffreyBooth added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. labels Apr 29, 2024
@joyeecheung joyeecheung added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Apr 29, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 29, 2024
@nodejs-github-bot nodejs-github-bot merged commit 4d59a9d into nodejs:main Apr 29, 2024
63 checks passed
@nodejs-github-bot
Collaborator

Landed in 4d59a9d

aduh95 pushed a commit that referenced this pull request Apr 30, 2024
PR-URL: #52047
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Ch3nYuY pushed a commit to Ch3nYuY/node that referenced this pull request May 8, 2024
@marco-ippolito marco-ippolito added the backport-blocked-v20.x PRs that should land on the v20.x-staging branch but are blocked by another PR's pending backport. label May 21, 2024
EliphazBouye pushed a commit to EliphazBouye/node that referenced this pull request Jun 20, 2024
bmeck pushed a commit to bmeck/node that referenced this pull request Jun 22, 2024
@targos targos added dont-land-on-v18.x PRs that should not land on the v18.x-staging branch and should not be released in v18.x. dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. and removed backport-blocked-v20.x PRs that should land on the v20.x-staging branch but are blocked by another PR's pending backport. labels Sep 21, 2024
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. cli Issues and PRs related to the Node.js command line interface. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. dont-land-on-v18.x PRs that should not land on the v18.x-staging branch and should not be released in v18.x. dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. esm Issues and PRs related to the ECMAScript Modules implementation. lib / src Issues and PRs related to general changes in the lib or src directory. module Issues and PRs related to the module subsystem. needs-ci PRs that need a full CI run.
9 participants