Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webpack/bundler Fix: Update worker JS file import URL to be compatible with bundlers #22165

Merged

Conversation

Twinklebear
Copy link
Contributor

This PR closes #22140 by correcting the URL used to import the worker JS in library_pthread.js::allocateUnusedWorker. Previously the URL would be:

worker = new Worker(new URL(import.meta.url))

which webpack translates to thefile:/// path, causing the import to fail b/c the file:// path is not valid to access (webpack docs import meta). For example, from my own app that I hit this issue on:

wgpu_app.js:1050 Uncaught (in promise) DOMException: Failed to construct 'Worker': Script at 'file:///Users/will/repos/webgpu-cpp-wasm/web/src/cpp/wgpu_app.js' cannot be accessed from origin 'http://localhost:8080'.
    at Object.allocateUnusedWorker (http://localhost:8080/6009193530ba34a86bac.js:4590:14)
    at Object.initMainThread (http://localhost:8080/6009193530ba34a86bac.js:4487:15)
    at Object.init (http://localhost:8080/6009193530ba34a86bac.js:4481:15)
    at http://localhost:8080/6009193530ba34a86bac.js:12723:9
    at http://localhost:8080/6009193530ba34a86bac.js:982:81

The correct URL for bundlers is:

worker = new Worker(new URL('{{{ TARGET_JS_NAME }}}', import.meta.url));

Which webpack will understand and properly translate to the bundled path. Webpack will also automatically generate a new entry point for the worker file when bundling (docs).

I've updated the webpack ES6 test added in #22142 to exercise this code path by building w/ pthreads and adding -sPTHREAD_POOL_SIZE=1 to cause an unallocated worker to be created. Let me know if it's preferred to add a new separate webpack + es6 + threads test instead of just adding the threading flags on to the existing webpack es6 test.

Without the fix to the worker import URL added in this PR the test would fail with the error message below after enabling threads and starting one. After the change to the worker import URL, this test passes again.

test_webpack_es6 (test_browser.browser) ... Opening in existing browser session.
asset main.js 101 KiB [emitted] (name: main)
asset 505f5fe022ebe2b256f9.wasm 38.7 KiB [emitted] [immutable] [from: src/hello.wasm] (auxiliary name: main)
runtime modules 2.22 KiB 6 modules
cacheable modules 94.9 KiB (javascript) 38.7 KiB (asset)
  ./src/index.mjs 404 bytes [built] [code generated]
  ./src/hello.mjs 94.4 KiB [built] [code generated]
  ./src/hello.wasm 42 bytes (javascript) 38.7 KiB (asset) [built] [code generated]
webpack 5.92.1 compiled successfully in 96 ms
[test error (see below), automatically retrying]
Expected to find '/report_result?exit:0
' in '/report_result?exception:Failed to construct 'Worker': Script at 'file:///Users/will/repos/emscripten/out/test/webpack/src/hello.mjs' cannot be accessed from origin 'http://localhost:8888'. / Error: Failed to construct 'Worker': Script at 'file:///Users/will/repos/emscripten/out/test/webpack/src/hello.mjs' cannot be accessed from origin 'http://localhost:8888'.    at Object.allocateUnusedWorker (http://localhost:8888/webpack/dist/main.js:1625:18)    at Object.initMainThread (http://localhost:888[..]
', diff:

--- expected
+++ actual
@@ -1 +1 @@
-/report_result?exit:0
+/report_result?exception:Failed to construct 'Worker': Script at 'file:///Users/will/repos/emscripten/out/test/webpack/src/hello.mjs' cannot be accessed from origin 'http://localhost:8888'. / Error: Failed to construct 'Worker': Script at 'file:///Users/will/repos/emscripten/out/test/webpack/src/hello.mjs' cannot be accessed from origin 'http://localhost:8888'.    at Object.allocateUnusedWorker (http://localhost:8888/webpack/dist/main.js:1625:18)    at Object.initMainThread (http://localhost:88[..]


FAIL
[Browser harness server terminated]

After updating library_pthread.js::allocateUnusedWorker() this test passes again:

test_webpack_es6 (test_browser.browser) ... Opening in existing browser session.
asset main.js 101 KiB [emitted] (name: main)
asset src_hello_mjs.js 101 KiB [emitted]
asset 505f5fe022ebe2b256f9.wasm 38.7 KiB [emitted] [immutable] [from: src/hello.wasm] (auxiliary name: main)
runtime modules 4.59 KiB 14 modules
cacheable modules 94.9 KiB (javascript) 38.7 KiB (asset)
  ./src/index.mjs 404 bytes [built] [code generated]
  ./src/hello.mjs 94.5 KiB [built] [code generated]
  ./src/hello.wasm 42 bytes (javascript) 38.7 KiB (asset) [built] [code generated]

WARNING in [entry] [initial]
Circular dependency between chunks with runtime (main, src_hello_mjs)
This prevents using hashes of each other and should be avoided.

webpack 5.92.1 compiled with 1 warning in 91 ms
ok
[Browser harness server terminated]

----------------------------------------------------------------------
Ran 1 test in 0.996s

OK

I tried adding -pthread -sPTHREAD_POOL_SIZE=1 to the non-es6 webpack test path as well but this fails with an error about the document not being defined. So I've left the non-es6 test alone, and just the es6 path will test threads support. I tried just adding some if (document) here but still got the error about the document not being defined, so I wasn't sure quite what to change here to make this run with threads (or if that was even needed).

Expected to find '/report_result?exit:0
' in '/report_result?exception:Uncaught ReferenceError: document is not defined / undefined
', diff:

--- expected
+++ actual
@@ -1 +1 @@
-/report_result?exit:0
+/report_result?exception:Uncaught ReferenceError: document is not defined / undefined


FAIL
[Browser harness server terminated]

@Twinklebear Twinklebear force-pushed the will/fix-worker-es6-import-meta branch from 8938042 to f95b7b9 Compare June 29, 2024 21:13
Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this!

I do think its a bit of a shame that we have to make the code strictly worse (IMHO) because the bundlers can't handle the shorter form. But I it is an important use case.

@RReverser do you think we can/should file this a bug on the webpack side.

@@ -5548,13 +5548,13 @@ def test_error_reporting(self):
def test_webpack(self, es6):
if es6:
shutil.copytree(test_file('webpack_es6'), 'webpack')
self.emcc_args += ['-sEXPORT_ES6']
self.emcc_args += ['-sEXPORT_ES6', '-pthread', '-sPTHREAD_POOL_SIZE=1']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be enough to instead use -sPROXY_TO_PTHREAD. That way not only do we ensure that a thread is started but we actually run some code (the main function) on that thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried -sPROXY_TO_PTHREAD initially but the test hangs instead of explicitly failing even though the same failure case occurs because the exception is caught by callMain:

return handleException(e);
and doesn't seem to bubble up to where it'd crash/fail at the top level of the test JS code. If I build the test with -sPTHREADS_DEBUG, I can see it tries to load the worker from file://, but the exception doesn't make it up to where the test harness can see it failed.

Is there a way I could fix that in the test case to see the exception?

As an semi-related aside, in my own apps I've been building w/ -sINVOKE_RUN=0 because callMain would gobble exceptions/crashes from Wasm making it hard to debug.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "would gobble exceptions/crashes from Wasm"? callMain should only deal with unwind/exit exceptions and should rethrow all other exceptions (i.e. exceptions that are not specifically supposed to be caught at the entry point).. Are you not seeing that behaviour? Or is it the rethrowning itself that is an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah exactly, when I step through I see that in handleException it checks for those unwind/exit exceptions and then goes to quit_ where it will rethrow the exception, but after that throw I just end up at this line: https://github.com/emscripten-core/emscripten/blob/main/test/webpack_es6/src/index.mjs#L16 where it prints loaded, and the exception seems to have disappeared? It's pretty strange, since the throw just seems to vanish after that point, when stepping through I don't end up at any other catch block/promise catch handling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stepping a bit more, I see that it does end up catching the error when calling instantiateAsync and rejecting the promise, but that promise rejection doesn't seem to make it back up the app instantiating the module, so index.mjs doesn't see the rejection

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that certainly does seem like something worth investigating more. Do you have a global unhandled rejection handler on your page? Is your expectation that the module promise would be rejected? I support that would be the most logical expectation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a global handler on my page, but the stepping through I'm describing here was actually just on running the test case: https://github.com/emscripten-core/emscripten/blob/main/test/webpack_es6/src/index.mjs . I think that the module promise being rejected if an error was thrown during callMain would be expected behavior, for example if the test code was like below, I'd expect the error to end up in the catch handler and print the error

import Module from './hello.mjs';
Module(params).then((instance) => {
  console.log('loaded');
}).catch((error) => {
  console.error(`Error: ${error}`);
})

The instantiateAsync call:

instantiateAsync(wasmBinary, wasmBinaryFile, info, receiveInstantiationResult).catch(readyPromiseReject);
has a catch handler that's fired after the exception is rethrown since that's the next spot that catches it, and it calls the promise reject function. However that promise rejection doesn't make it up to the top level of the app so the Module(params) promise is seen as successful instead of rejected (since .then is called instead)

src/library_pthread.js Outdated Show resolved Hide resolved
@sbc100
Copy link
Collaborator

sbc100 commented Jun 29, 2024

Can you update the title so that its clear this is webpack/bundler fix. To be clear there no issue with existing syntax other than bundlers seem to be confused by it.

Also, could you add a comment in the code saying that this is why need to use the longer form?

@Twinklebear Twinklebear changed the title Fix URL created for worker JS file import and change test to test it webpack/bundler Fix: Update worker JS file import URL to be compatible with bundlers Jun 29, 2024
@Twinklebear
Copy link
Contributor Author

Sounds good, I've updated the title and added a note explaining why we use this worse form of creating the file URL

@andreamancuso
Copy link

andreamancuso commented Jun 30, 2024

I'm also impacted by this - looking forward to this PR getting merged.

@sbc100 sbc100 merged commit a7b833c into emscripten-core:main Jul 1, 2024
28 checks passed
@RReverser
Copy link
Collaborator

do you think we can/should file this a bug on the webpack side.

I don't think so, that's a specific pattern recognised by all bundlers.

I'm also surprised, I was pretty sure we already emitted a coorect form back when I added it and even had tests to ensure it's this specific pattern that gets emitted in the output? Did that change at some point?

@sbc100
Copy link
Collaborator

sbc100 commented Jul 9, 2024

Yes, I think it changed with the removal of worker.js file in #21701

@sbc100
Copy link
Collaborator

sbc100 commented Jul 9, 2024

Yes, I think it changed with the removal of worker.js file in #21701

What happened in this change is that we went from creating a worker using a filename that was different to the current file (i.e. worker.js) to creating a worker using the exact same filename... so it made sense to me that we could transition to using the shorter/simpler form of new Worker that just uses import.meta.url. Its really shame we cannot use that form since the explicit name is more fragile.

@RReverser
Copy link
Collaborator

Ah I see. Yeah I get the distaste for an explicit name, but to be fair that's already the case for other assets like Wasm itself, so doesn't seem too problematic on its own. One way or another you need to do find-replace when renaming those files.

@fs-eire
Copy link
Contributor

fs-eire commented Jul 19, 2024

This change breaks my code which try to put everything into a single file.

Please consider this example:

// main.js. Will build into bundle `my-output.js`
// Emscripten generates `a.wasm` and `a.js`.

import wasmFactory from './a.js';

export const myFunc = async (param) => {
  const instance = await wasmFactory();
  instance.func(param);
};

Without this change, it is able to build the code above into a single file my-output.js, which will create PThread worker threads (import.meta.url will resolve to the path to my-output.js). However now this is impossible because a hard-coded file name is written in a.js.

I understand the problem that may affect webpack users, but this is a feature regression to me if I upgrade from 3.1.62 to 3.1.63. It would be great if a setting is added to allow reverting back to 3.1.62 behavior.

@sbc100

@fs-eire
Copy link
Contributor

fs-eire commented Jul 20, 2024

Thanks so much for working on this!

I do think its a bit of a shame that we have to make the code strictly worse (IMHO) because the bundlers can't handle the shorter form. But I it is an important use case.

@RReverser do you think we can/should file this a bug on the webpack side.

I agree with the "its a bit of a shame that we have to make the code strictly worse" part. For whatever reason, the current behavior on import.meta of webpack breaks JavaScript code which tries to evaluate script URL at runtime when using ESM. Some webpack users are aware of this issue and there is a workaround to disable the webpack's default behavior of rewriting import.meta.url by adding the following to webpack.config.js:

module : {parser : {javascript : {importMeta : false}}},

It was a very elegant solution to use new Worker(new URL(import.meta.url) because technically this allows an ESModule to be loaded correctly by both static import statement and dynamic import() when made into bundle.

@RReverser
Copy link
Collaborator

which will create PThread worker threads (import.meta.url will resolve to the path to my-output.js).

Hm I'd say you shouldn't be relying on import.meta.url resulting in pointing to any file except your own. That was UB to begin with.

@marcosc90
Copy link

This change broke vite builds with

error during build:
[commonjs--resolver] Circular worker imports detected. Vite does not support it

see: vitejs/vite#16103

@fs-eire
Copy link
Contributor

fs-eire commented Jul 23, 2024

which will create PThread worker threads (import.meta.url will resolve to the path to my-output.js).

Hm I'd say you shouldn't be relying on import.meta.url resulting in pointing to any file except your own. That was UB to begin with.

I expect import.meta.url to be the runtime URL of the current script, which is described in https://html.spec.whatwg.org/multipage/webappapis.html#hostgetimportmetaproperties. The example (my-output.js) is supposed to be generated by a build process that well defined in the project. I don't understand why you think there is any undefined behavior.

@fs-eire
Copy link
Contributor

fs-eire commented Jul 24, 2024

There are other places in the generated JS code which uses import.meta.url. For example, the following line:

  var _scriptName = import.meta.url;

This will be rewritten into a static file://-like string by webpack by default, so functions that uses _scriptName will not work correctly (locateFile()).

The real problem is, Webpack's default behavior does not allow runtime evaluation of import.meta.url. However, Emscripten generated JS may need the capability to get the current script URL at runtime for some feature.

@Twinklebear
Copy link
Contributor Author

Twinklebear commented Jul 24, 2024

This change broke vite builds with

error during build:
[commonjs--resolver] Circular worker imports detected. Vite does not support it

see: vitejs/vite#16103

While the build is broken here, I wonder if it's actually correct for it to be broken and that maybe Vite needs to support the main JS file referencing itself to make a Worker? Because Emscripten's new Worker call is a circular reference even before this change. The original URL:

worker = new Worker(new URL(import.meta.url))

and

worker = new Worker(new URL('{{{ TARGET_JS_NAME }}}', import.meta.url));

Technically do refer to the same file, so in both cases it's a circular reference. It's just that the bundler doesn't recognize/check the first case and lets it through. Vite's docs do mention the latter case that this PR added: https://vitejs.dev/guide/assets#new-url-url-import-meta-url . That PR mentions Vite supporting self referencing worker files, maybe it needs to extend support to allow a non-worker file to reference itself to create a worker?

@sbc100
Copy link
Collaborator

sbc100 commented Jul 31, 2024

This change breaks my code which try to put everything into a single file.

Please consider this example:

// main.js. Will build into bundle `my-output.js`
// Emscripten generates `a.wasm` and `a.js`.

import wasmFactory from './a.js';

export const myFunc = async (param) => {
  const instance = await wasmFactory();
  instance.func(param);
};

Without this change, it is able to build the code above into a single file my-output.js, which will create PThread worker threads (import.meta.url will resolve to the path to my-output.js). However now this is impossible because a hard-coded file name is written in a.js.

I understand the problem that may affect webpack users, but this is a feature regression to me if I upgrade from 3.1.62 to 3.1.63. It would be great if a setting is added to allow reverting back to 3.1.62 behavior.

@sbc100

I'm not sure that use case you describe will work since wouldn't that result in my-output.js being loaded on each pthread instead of just the a.js code (which is what is required).

e.g. if you added console.log to your main.js that should not get run on each worker startup right? But I think if you using mport.meta.url (a.k.a my-output.js) as your pthread startup file then it would be run on each thread, no?

@dwayneparton
Copy link

This change broke vite builds with

error during build:
[commonjs--resolver] Circular worker imports detected. Vite does not support it

see: vitejs/vite#16103

@brianpmaher and I have been trying to find a work around to support both vite and webpack for an internal library. We were able to bypass the circular deps error by replacing:

worker = new Worker(new URL('{{{ TARGET_JS_NAME }}}', import.meta.url));

with

var _workerUrl=new URL('{{{ TARGET_JS_NAME }}}', import.meta.url);
worker=new Worker(_workerUrl, workerOptions);

Maybe not the best solution but thought I'd share, hopefully that helps someone.

@sbc100
Copy link
Collaborator

sbc100 commented Aug 2, 2024

@brianpmaher and I have been trying to find a work around to support both vite and webpack for an internal library. We were able to bypass the circular deps error by replacing:

Just to be clear, you are not actually removing the circularity here? (I think that is not actually possible). This somehow suppresses the warning but making the circularity obscure the algorithms checking for it?

@dwayneparton
Copy link

dwayneparton commented Aug 2, 2024

@sbc100 Correct. It's not removing it, more exploiting a hole in the algorithm vite uses to check.

A couple more details:
Here's what we were seeing

// does not work with vite but does with webpack
worker = new Worker(new URL('{{{ TARGET_JS_NAME }}}', import.meta.url));

// works with vite but not webpack
worker = new Worker(new URL(import.meta.url));

// works with vite and webpack
var _workerUrl=new URL('{{{ TARGET_JS_NAME }}}', import.meta.url);
worker=new Worker(_workerUrl, workerOptions);

@jamsinclair
Copy link

jamsinclair commented Aug 17, 2024

@dwayneparton Which version of vite are you using? From versions v5.2.2 simple self-referential circular dependencies can be handled when using module workers (vitejs/vite#16103).

The following config may need to be added to your vite.config.js file.

worker: {
    format: "es"
}

I think you would also need to update emscripten to use the flag -s EXPORT_ES6=1 so module workers are used (please correct me if I'm wrong here).

However, I have noticed in recent versions of emscripten the switch to storing workerOptions in a variable has broken Vite builds. I aim to create an issue and reproduction soon 😅 (Edit: the issue is #22394)

@dwayneparton
Copy link

@jamsinclair We're are using vite v5.3.1 with the worker.format = "es".

I think you would also need to update emscripten to use the flag -s EXPORT_ES6=1 so module workers are used (please correct me if I'm wrong here).

We also have the EXPORT_ES6 flag but maybe there is more there. @brianpmaher

The main issue we had was getting it to work in both Vite and Webpack w/o any treatment to the worker url.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL path generated for Worker imports doesn't work with Webpack
8 participants