-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
module: detect ESM syntax by trying to recompile as SourceTextModule #52413
module: detect ESM syntax by trying to recompile as SourceTextModule #52413
Conversation
Review requested:
|
This seems like it works. Can we think of any cases where a module would not use top-level await, import or export or import.meta syntax, and does not lexically define any of the CJS variables, where it would parse as ESM and not a script function body? I can't think of anything but it would be great if we could fully confirm the edge cases. |
I think no matter what edge cases there are, it's pretty unlikely that reparsing it with a weird async function wrapper (what the main branch currently does) can be any more correct than reparsing as SourceTextModule....I'd prefer to just go ahead with this to unblock #52047 |
There are two reasons for the current behavior:
I'm on my phone so I can't tell if these have been addressed, but if so, then sure let's change the algorithm (and update the docs accordingly). I think the first point has an existing test covering it, but if not then we should add one. |
That's not changed. The error getting rethrown from catch block in
That's not changed either. It is still only reparsed when the error is in the |
I don't think the algorithm as documented changes in anyway as it does not mention specifics about how the code is reparsed (would be somewhat funny to mention that async wrapper too). The doc only lists errors used to determine whether the code should be retried as ESM, those aren't changed by this PR. We are still using the same set to determine syntax errors that definitely indicate ESM syntax and don't need a reparse, and the same set to determine syntax errors that warrant a reparse to double-check. |
179dabd
to
dc9f175
Compare
const { enrichCJSError } = require('internal/modules/esm/translators'); | ||
enrichCJSError(error, source, resolvedMain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is removing these lines not causing any tests to break? There are tests that check the output of this “enriched” error that I’d think should be failing without these lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those only check part of the sentence because the rest keeps changing (typically it's not a good idea to check all of the message, either). The error is now "enriched" in C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I was talking about my followup. In this branch, this part is already just unreachable, even before this patch, because if !retryAsESM
, then the code doesn't contains ESM syntax, then the error won't be "enriched" by enrichCJSError()
anyway (i.e. skipping inside enrichCJSError()
already).
try_catch.ReThrow(); | ||
return; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this consolidation mean that we no longer need the try/catch at all? In other words, we could revert #52093 to go back to the previous approach of calling containsModuleSyntax
, and then running the CJS entry with the CJS loader, and the performance cost will be negligible because the parse gets cached in V8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's what my follow up does https://github.com/joyeecheung/node/commits/esm-detect-cpp/ which also implements detection support for require(esm)
: if it's entry point, and the detection flag is enabled, and it looks like ESM, use the path that allow TLA, and ignore the exports because uh no one needs the module.exports
of the entry point..at least for something that's already ESM because that would not have been possible anyway? If it's not entry point, and the require flag is enabled, or detection flag is enabled and it looks like ESM, do synchronous require esm and add module.exports. Though I don't think it's a good idea to merge that into this PR, this is a refactoring, that one implement things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it’s entry point, and the detection flag is enabled, and it looks like ESM, use the path that allow TLA, and ignore the exports because uh no one needs the
module.exports
of the entry point
What if the entry point gets required or imported a second time? Like node entry.js
imports foo.js
which imports entry.js
? The second import would need the exports from entry.js
.
Can we land your follow-up and unflag detect-module
while require(esm)
is still flagged? Or is require(esm)
possibly close enough to being ready to unflag that we could just do them together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second import would need the exports from entry.js.
If entry.js is ESM, then that's the namespace, not module.exports
, and we'll return the namespace of the cached module in the ESM loader, because import entry.js
is not handled by CJS loader when the module job is already cached.
If it's require entry.js
again, then we can error or warn that it's the entry point, or return the namespace if it's fully evaluated. But that's talking about require(esm)
edge cases in a follow-up now, pretty off topic for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we land your follow-up and unflag detect-module while require(esm) is still flagged?
I don't know for sure, locally my followup does pass test/parallel/test-debugger-exceptions.js
even when detect-module is unflagged, though, if you think that's the last blocker for unflagging detection. require(esm)
will need to be flagged for at least 22, blockers are __esModule
interop and customization hooks support, and maybe there are more as we get more feedback from 22. In general I'm not too comfortable with just unflagging an experimental feature that really needs some battle testing from users at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I'm not too comfortable with just unflagging an experimental feature that really needs some battle testing from users at least.
Especially for a feature like this one. As soon as it can be used without a flag, people will start to publish packages that depend on it. Then we will be blamed if we break them.
1b571e1
to
6eec04e
Compare
// In case the entry point is a large file, such as a bundle, | ||
// ensure no further references can prevent it being garbage-collected. | ||
cjsLoader.entryPointSource = undefined; | ||
if (error != null && ObjectGetPrototypeOf(error) === SyntaxErrorPrototype) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential follow-up: Do we control the error being thrown? If so, I'm wondering if we should create a sub-class like MismatchedSyntaxError
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The follow up is just don't do the check in JS, and check right after parsing in C++ :)
Instead of using an async function wrapper, just try compiling code with unknown module format as SourceTextModule when it cannot be compiled as CJS and the error message indicates that it's worth a retry. If it can be parsed as SourceTextModule then it's considered ESM. Also, move shouldRetryAsESM() to C++ completely so that we can reuse it in the CJS module loader for require(esm). Drive-by: move methods that don't belong to ContextifyContext out as static methods and move GetHostDefinedOptions to ModuleWrap.
425d8ee
to
14afc5a
Compare
Landed in 651fa04 |
Instead of using an async function wrapper, just try compiling code with unknown module format as SourceTextModule when it cannot be compiled as CJS and the error message indicates that it's worth a retry. If it can be parsed as SourceTextModule then it's considered ESM. Also, move shouldRetryAsESM() to C++ completely so that we can reuse it in the CJS module loader for require(esm). Drive-by: move methods that don't belong to ContextifyContext out as static methods and move GetHostDefinedOptions to ModuleWrap. PR-URL: #52413 Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com> Reviewed-By: Jacob Smith <jacob@frende.me>
* chore: bump Node.js to v22.9.0 * build: drop base64 dep in GN build nodejs/node#52856 * build,tools: make addons tests work with GN nodejs/node#50737 * fs: add fast api for InternalModuleStat nodejs/node#51344 * src: move package_json_reader cache to c++ nodejs/node#50322 * crypto: disable PKCS#1 padding for privateDecrypt nodejs-private/node-private#525 * src: move more crypto code to ncrypto nodejs/node#54320 * crypto: ensure valid point on elliptic curve in SubtleCrypto.importKey nodejs/node#50234 * src: shift more crypto impl details to ncrypto nodejs/node#54028 * src: switch crypto APIs to use Maybe<void> nodejs/node#54775 * crypto: remove DEFAULT_ENCODING nodejs/node#47182 * deps: update libuv to 1.47.0 nodejs/node#50650 * build: fix conflict gyp configs nodejs/node#53605 * lib,src: drop --experimental-network-imports nodejs/node#53822 * esm: align sync and async load implementations nodejs/node#49152 * esm: remove unnecessary toNamespacedPath calls nodejs/node#53656 * module: detect ESM syntax by trying to recompile as SourceTextModule nodejs/node#52413 * test: adapt debugger tests to V8 11.4 nodejs/node#49639 * lib: update usage of always on Atomics API nodejs/node#49639 * test: adapt test-fs-write to V8 internal changes nodejs/node#49639 * test: adapt to new V8 trusted memory spaces nodejs/node#50115 * deps: update libuv to 1.47.0 nodejs/node#50650 * src: use non-deprecated v8::Uint8Array::kMaxLength nodejs/node#50115 * src: update default V8 platform to override functions with location nodejs/node#51362 * src: add missing TryCatch nodejs/node#51362 * lib,test: handle new Iterator global nodejs/node#51362 * src: use non-deprecated version of CreateSyntheticModule nodejs/node#50115 * src: remove calls to recently deprecated V8 APIs nodejs/node#52996 * src: use new V8 API to define stream accessor nodejs/node#53084 * src: do not use deprecated V8 API nodejs/node#53084 * src: do not use soon-to-be-deprecated V8 API nodejs/node#53174 * src: migrate to new V8 interceptors API nodejs/node#52745 * src: use supported API to get stalled TLA messages nodejs/node#51362 * module: print location of unsettled top-level await in entry points nodejs/node#51999 * test: make snapshot comparison more flexible nodejs/node#54375 * test: do not set concurrency on parallelized runs nodejs/node#52177 * src: move FromNamespacedPath to path.cc nodejs/node#53540 * test: adapt to new V8 trusted memory spaces nodejs/node#50115 * build: add option to enable clang-cl on Windows nodejs/node#52870 * chore: fixup patch indices * chore: add/remove changed files * esm: drop support for import assertions nodejs/node#54890 * build: compile with C++20 support nodejs/node#52838 * deps: update nghttp2 to 1.62.1 nodejs/node#52966 * src: parse inspector profiles with simdjson nodejs/node#51783 * build: add GN build files nodejs/node#47637 * deps,lib,src: add experimental web storage nodejs/node#52435 * build: add missing BoringSSL dep * src: rewrite task runner in c++ nodejs/node#52609 * fixup! build: add GN build files * src: stop using deprecated fields of v8::FastApiCallbackOptions nodejs/node#54077 * fix: shadow variable * build: add back incorrectly removed SetAccessor patch * fixup! fixup! build: add GN build files * crypto: fix integer comparison in crypto for BoringSSL * src,lib: reducing C++ calls of esm legacy main resolve nodejs/node#48325 * src: move more crypto_dh.cc code to ncrypto nodejs/node#54459 * chore: fixup GN files for previous commit * src: move more crypto code to ncrypto nodejs/node#54320 * Fixup Perfetto ifdef guards * fix: missing electron_natives dep * fix: node_use_node_platform = false * fix: include src/node_snapshot_stub.cc in libnode * 5507047: [import-attributes] Remove support for import assertions https://chromium-review.googlesource.com/c/v8/v8/+/5507047 * fix: restore v8-sandbox.h in filenames.json * fix: re-add original-fs generation logic * fix: ngtcp2 openssl dep * test: try removing NAPI_VERSION undef * chore(deps): bump @types/node * src: move more crypto_dh.cc code to ncrypto nodejs/node#54459 * esm: remove unnecessary toNamespacedPath calls nodejs/node#53656 * buffer: fix out of range for toString nodejs/node#54553 * lib: rewrite AsyncLocalStorage without async_hooks nodejs/node#48528 * module: print amount of load time of a cjs module nodejs/node#52213 * test: skip reproducible snapshot test on 32-bit nodejs/node#53592 * fixup! src: move more crypto_dh.cc code to ncrypto * test: adjust emittedUntil return type * chore: remove redundant wpt streams patch * fixup! chore(deps): bump @types/node * fix: gn executable name on Windows * fix: build on Windows * fix: rename conflicting win32 symbols in //third_party/sqlite On Windows otherwise we get: lld-link: error: duplicate symbol: sqlite3_win32_write_debug >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:47987 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_sleep >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48042 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_is_nt >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48113 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_utf8_to_unicode >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48470 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_unicode_to_utf8 >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48486 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_mbcs_to_utf8 >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48502 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_mbcs_to_utf8_v2 >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48518 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_utf8_to_mbcs >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48534 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj lld-link: error: duplicate symbol: sqlite3_win32_utf8_to_mbcs_v2 >>> defined at .\..\..\third_party\electron_node\deps\sqlite\sqlite3.c:48550 >>> obj/third_party/electron_node/deps/sqlite/sqlite/sqlite3.obj >>> defined at obj/third_party/sqlite\chromium_sqlite3/sqlite3_shim.obj * docs: remove unnecessary ts-expect-error after types bump * src: move package resolver to c++ nodejs/node#50322 * build: set ASAN detect_container_overflow=0 nodejs/node#55584 * chore: fixup rebase * test: disable failing ASAN test * win: almost fix race detecting ESRCH in uv_kill libuv/libuv#4341
Instead of using an async function wrapper, just try compiling code with unknown module format as SourceTextModule when it cannot be compiled as CJS and the error message indicates that it's worth a retry. If it can be parsed as SourceTextModule then it's considered ESM.
Also, move shouldRetryAsESM() to C++ completely so that we can reuse it in the CJS module loader for require(esm).
Drive-by: move methods that don't belong to ContextifyContext out as static methods and move GetHostDefinedOptions to ModuleWrap.