Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement shim_url #3247

Closed
wants to merge 2 commits into from
Closed

Implement shim_url #3247

wants to merge 2 commits into from

Conversation

daxpedda
Copy link
Collaborator

Adjusted #3032 according to @Liamolucko's comment: #3032 (comment).

This is now only implemented for the web and no-modules targets, all other's return None.

Replaces #3032.

@daxpedda
Copy link
Collaborator Author

Similarly to #3168 should I return a Result instead of an Option?

@daxpedda daxpedda mentioned this pull request Jan 30, 2023
@daxpedda daxpedda force-pushed the shim-url branch 3 times, most recently from be1385b to e1f7641 Compare February 2, 2023 12:51
@daxpedda
Copy link
Collaborator Author

daxpedda commented Feb 2, 2023

Rebased after #3272 and #3279.

@Liamolucko friendly ping.

@daxpedda
Copy link
Collaborator Author

Rebased after #3329.

src/lib.rs Outdated Show resolved Hide resolved
@lukaslihotzki
Copy link
Contributor

I think the better way to implement this are shim format-specific getters. For example, es_module_shim_url(), instead of shim_format() and shim_url().
The main reason is that supporting multiple shim formats are a maintenance burden for other crates. Therefore, other crates may decide to only support a single format and tell their users to simply use that. This could lead to conflicts when a user wants to use multiple independent crates as dependencies. Even if crates want to support multiple shim formats, they need rather complex tests and CI to prevent breakage of lesser used formats. Shim format-specific getters avoids this complexity and help to create a compatible, healthy ecosystem around wasm-bindgen.
Also, some formats may be unsuitable for some purposes. For example, can the shim of the bundler target, that statically imports the Wasm module, be used with an existing memory instance? What about future targets that may not use JS shims at all? Adding an ES module shim that provides the existing interface allows new targets without breaking existing crates.

es_module_shim_url() would return import.meta.url on targets that use an ES shim natively. Otherwise, an extra ES shim would be included like a linked module. This ES shim is able to implement to use import.meta.url again (even with --split-linked-modules unset, import.meta.url will return a data URL that works fine).

More format-specific getters could be added for other formats (like nomodules_shim_url()), but I don't think this is needed. ES modules are the standardized feature to compose JS code, that works on browsers and servers. (Firefox 111 now supports ES modules in workers.) Also, their location can be discovered reliably with import.meta.url, so es_module_shim_url() prevents breakage of linked modules when used in Web Workers. On the other hand, is there any case where a nomodules shim is preferred?

@daxpedda
Copy link
Collaborator Author

Adding an ES module shim that provides the existing interface allows new targets without breaking existing crates.

Isn't this solved by using #[non_exhaustive]? Or is there something incompatible with shim_url and it returning a String?

(Firefox 111 now supports ES modules in workers.) ...
On the other hand, is there any case where a nomodules shim is preferred?

See #3355, but generally I'm in favor of deprecating when we reach that point.

Co-authored-by: Liam Murphy <liampm32@gmail.com>
@daxpedda daxpedda requested a review from Liamolucko March 14, 2023 23:57
@lukaslihotzki
Copy link
Contributor

Isn't this solved by using #[non_exhaustive]? Or is there something incompatible with shim_url and it returning a String?

The problem is that existing crates (without updates) will compile with #[non_exhaustive], but they are unable to correctly use the shim if they don't understand the format. This is the same problem like when they get the UnsupportedTargetError. This can be prevented by supporting multiple shims, so the shim with the interface expected by the caller can be returned.

@daxpedda
Copy link
Collaborator Author

daxpedda commented Mar 15, 2023

That's a really good point actually. I would like to suggest to merge shim_url and shim_format in one function instead of introducing multiple functions for each shim. @Liamolucko WDYT?

(CI got stuck btw)

@Liamolucko
Copy link
Collaborator

That's a really good point actually. I would like to suggest to merge shim_url and shim_format in one function instead of introducing multiple functions for each shim. @Liamolucko WDYT?

That sounds great, assuming I understood you properly!

To be clear, do you mean something like:

fn shim_url(format: ShimFormat) -> String { ... }

Which then returns the URL of a shim with the format you passed in?

Just double-checking because what you're describing also sounds kinda like:

fn shim() -> (String, ShimFormat) { ... }

Which I wouldn't be particularly in favour of.

@daxpedda
Copy link
Collaborator Author

I was actually suggesting:

fn shim() -> Shim { ... }

enum Shim {
	EsModule { url: String },
	NoModules { url: Option<String>, global_name: String },
	...
}

// or maybe:
struct Shim {
	url: Option<String>,
	format: ShimFormat,
}

But your suggestion with passing in the format might be better if we want to support getting multiple shims? Or we could return a Vec?

Personally, if we go the route of wanting to support multiple shims I would prefer returning a Vec, making multiple calls to find the available shim you are looking for sounds a bit awful.

@Liamolucko
Copy link
Collaborator

Ah, I see, you were talking about @lukaslihotzki's comment that crates would break when they didn't understand the shim format. I thought you were talking about their suggestion to support multiple shims.

I don't think that combining shim_url and shim_format addresses the problem as I understood it. It stops crates from mistakenly ignoring shim_format and assuming that the shim is a particular format, but I think the problem @lukaslihotzki was getting at was that if you want to use a crate that relies on shim_url, you simply can’t if you’re using a target without a shim or a new shim format which the crate hasn’t added support for yet.

Supporting multiple shims solves that, since the crate can pick whatever shim it likes, and doesn't have to worry about what shim the current target is using, if it even has one.

I do think that moving to an enum like that is an improvement if we stick to the one-shim approach, but I'm now leaning more towards having multiple shims here as well as in linked modules. @lukaslihotzki makes a good point that it makes it a lot easier for crates to be target-agnostic, and makes it possible for this feature to work on —-target bundler at all.

As for how that would actually work, the version of shim_url I semi-proposed wouldn't be querying a list of existing shims, it would be generating the shim, in a similar fashion to link_to! generating linked modules. I think it would actually have to be a macro to do that, not a function, but at the time I was just trying to check what you were saying and didn't stop to think 'wait, how?' Or it could be multiple functions like @lukaslihotzki is proposing.

So, there wouldn't be a problem of needing multiple calls to 'find' a shim like you were describing.

@daxpedda
Copy link
Collaborator Author

Thanks, I see what I misunderstood here.

I'm starting to question the multiple shims approach. The advantage of having multiple shims is that a library maintainer doesn't have to think about what shim the user currently uses, but gets whatever they desire. This actually comes with an overhead though, multiple shims have to be bundled into the application. But of course supporting each shim separately comes with a higher maintenance burden for a library maintainer.

Not that I'm against the idea of multiple shims, but I think it's a choice users should be able to make. For example I don't use a bundler and I don't have any snippets either. I have a single JS file that I can minimize and serve and that's it, I don't want to buy into the overhead the bundler target brings with it.

Also I'm not sure how this is supposed to work with the no-modules target, when I use the no-modules target I do it for compatibility purposes, an underlying library can't just use the web target, that would break my build, I want it to use what I use, if it doesn't support it then I have to fix the library or get another. Even if we deprecate the no-modules target, not all future targets will be compatible with each other.

So the conclusion I'm coming to while I'm writing this is that it's an interesting idea that we could support importing any desired shim, but my use-case still requires the current approach of requesting the currently used shim and act appropriately.


I would also like to understand exactly how the bundler target works, I mentioned before that I have no clue of the JS ecosystem, I don't know what Webpack is and I've never used a bundler before. Taking a brief look at this stuff, it seems quite straightforward to me, they bundle all the files together, probably doing minimization at the same point too.

My understanding is that the bundler target doesn't work without actually using a bundler on the generated files, so how exactly would that work if a user uses the web target but a library pulls in the bundler target shim? Apologies for the naivety on this topic.

@lukaslihotzki
Copy link
Contributor

This actually comes with an overhead though, multiple shims have to be bundled into the application.

The overhead should be small, because shims are usually much smaller than the Wasm module. For wasm-audio-worklet (very small app), the shim size is 17% of the Wasm size (both compressed). In a more interesting project that uses winit and wgpu, the shim is only 1.5% of the Wasm. Also, the size of extra shims can be optimized by sharing code between them. ES module shims (like bundler and web) can import each other. no-modules shims cannot import, but they can be imported. Such optimizations can be implemented transparently at any time in wasm-bindgen.

Not that I'm against the idea of multiple shims, but I think it's a choice users should be able to make. For example I don't use a bundler and I don't have any snippets either. I have a single JS file that I can minimize and serve and that's it, I don't want to buy into the overhead the bundler target brings with it.

There is no overhead if you request the shim url of the native format of your target. If your manual build setup does not support linked modules, omit --split-linked-modules. wasm-bindgen could return the requested shim as data URL, possibly with code sharing where possible, if that is implemented. A larger shim is better than a broken product, and nothing prevents you from requesting the native format for optimal size.

Also I'm not sure how this is supposed to work with the no-modules target, when I use the no-modules target I do it for compatibility purposes, an underlying library can't just use the web target.

If a library creates an audio worklet and uses the ES module shim in there, that should work fine even if --target no-modules is set, for example. Could you explain in which case it wouldn't work?

I want it to use what I use, if it doesn't support it then I have to fix the library or get another.

You can still change the library or choose another to match your native format. The difference with multiple shims is that users aren't forced to do it, if they accept the usually very low overhead. Also they can still do it at later time when their product works, but the shim size needs to be optimized.

@daxpedda
Copy link
Collaborator Author

The overhead should be small, because shims are usually much smaller than the Wasm module. For wasm-audio-worklet (very small app), the shim size is 17% of the Wasm size (both compressed). In a more interesting project that uses winit and wgpu, the shim is only 1.5% of the Wasm.

I don't disagree that the overhead is small, I would still prefer not to pay for it.

Also, the size of extra shims can be optimized by sharing code between them. ES module shims (like bundler and web) can import each other. no-modules shims cannot import, but they can be imported. Such optimizations can be implemented transparently at any time in wasm-bindgen.

That would be really nice but I'm concerned with this idea of "future optimization", wasm-bindgen doesn't have the maintenance capacity for this at all, imho.

In any case, I agree that whatever overhead I was talking about is negligible and I'm happy to accept it for a better feature.

There is no overhead if you request the shim url of the native format of your target. If your manual build setup does not support linked modules, omit --split-linked-modules. wasm-bindgen could return the requested shim as data URL, possibly with code sharing where possible, if that is implemented. A larger shim is better than a broken product, and nothing prevents you from requesting the native format for optimal size.

I agree, but that requires an API that actually can request the native format, the API we were discussing above might suggest that we can't:

fn shim_url(format: ShimFormat) -> String { ... }

If a library creates an audio worklet and uses the ES module shim in there, that should work fine even if --target no-modules is set, for example. Could you explain in which case it wouldn't work?

Audio worklets support both, but web workers don't, on Firefox they don't support ES modules (the web target).

But that is not the whole story, yes, Firefox will support it eventually, and then we can drop this point too. But what about future targets? Let's say the component model is introduced, we will need a new target for that, maybe that will only be supported on some browsers, or maybe only in the window but not in workers, or any other scenario.

This idea of making multiple shims available on demand only works if all targets are actually compatible with each other, which they are definitely not right now and we can't predict that new targets will be in the future.

You can still change the library or choose another to match your native format. The difference with multiple shims is that users aren't forced to do it, if they accept the usually very low overhead. Also they can still do it at later time when their product works, but the shim size needs to be optimized.

I agree. I want to emphasize I'm not against the idea of having multiple shims, but it doesn't work for all use-cases and I want to make sure we support at least the native approach!

@lukaslihotzki
Copy link
Contributor

lukaslihotzki commented Mar 16, 2023

Audio worklets support both, but web workers don't, on Firefox they don't support ES modules (the web target).

No, audio worklets do not support importScript. They only support ES import by design (not by missing browser implementation). Your point is exactly the reason why we do need multiple shims. So a web worker crate that wants to support Firefox can request the no-modules shim, while an audio crate can request the ES module shim, and everything works. The same reasoning applies to future targets. One reason why I even proposed multiple shims was to allow future usage of the Component Model without breaking existing JS bridges in crates.
Edit: I noticed you could probably ES import the no-modules shim, but having an actual ES module is much cleaner and illustrates how multiple shims increase compatibility instead of decreasing it (if you imagine importing the no-modules shim would be impossible for some reason).

I agree, but that requires an API that actually can request the native format

If you are the application developer, then you should know your target and choose the appropriate format. If you develop an independent library, you have to choose a format suitable for most of your users and suggest to use a corresponding target to your library users. The library users that choose a different format regardless will prefer a larger product compared to a broken product.

If a library developer wants to provide different native formats, we could add a macro like any_shim!(EsModule, NoModules) that would return one of the specified formats (preferring native), but I think app and library developers will happily accept the overhead if that helps to reduce complexity.

@daxpedda
Copy link
Collaborator Author

Okay I'm starting to see the issue here.

I was under the assumption that the proposal is to expose multiple shims to make it irrelevant for library developers which shim they pick compared to which shim the user picked.
But the suggestion was actually to let library developer be able to pick the shim they need, the one that actually works for whatever is being done despite the target the user picks.

Now the feature actually makes sense to me 😅!


No, audio worklets do not support importScript.

You can still use audio worklets on Firefox with the no-modules target by not using importScript(), but by just getting the shim and pasting it inside the module used in addModule(). But web workers on Firefox just don't support ES modules in any way, unless you transform them to classic JS course.

But absolutely, multiple shims would solve that problem much better, now that I understand the proposal it makes perfect sense.

If you are the application developer, then you should know your target and choose the appropriate format. If you develop an independent library, you have to choose a format suitable for most of your users and suggest to use a corresponding target to your library users. The library users that choose a different format regardless will prefer a larger product compared to a broken product.

If a library developer wants to provide different native formats, we could add a macro like any_shim!(EsModule, NoModules) that would return one of the specified formats (preferring native), but I think app and library developers will happily accept the overhead if that helps to reduce complexity.

I agree that this is how it will likely play out anyway.


So, after (hopefully) finally understanding the full picture here, I'm also strongly in favor of supporting multiple shims.
As far as I can tell I also agree with @Liamolucko that this will probably be a macro, so I'm gonna skip on discussing the APIs we suggested above and I guess we will discuss this again when we have some implementation to look at or in #3034.

About #3168 and #3247, I have some concerns and arguments why we should still merge them, more or less as-is.

I'm gonna start with that if we still want to support querying native modules in a macro, like any_shim!(EsModule, NoModules) (by @lukaslihotzki), we will still need something like shim_url and shim_format. But it could be argued they should be hidden as an implementation detail.

I am also concerned about the timeline here, this will take a long time to develop and to get right. Comparatively shim_url and shim_format are small and not much can go wrong here.

The main argument against it, which would be that we encourage libraries to depend on these features which could be harmful in the long run, is something that I now understand. This wouldn't be a problem if we could remove them again in a version bump, but I can only assume that this isn't going to happen anytime soon. But I also don't think it's a huge deal and I'm not sure how harmful this can really be. Wasm and browsers are a moving target and libraries have to update and adjust anyway.

I do still think though that combining shim_url and shim_format could alleviate at least some of the concerns though. At least nobody is gonna just use shim_url without thinking about the format.

@daxpedda daxpedda added the needs discussion Requires further discussion label May 11, 2023
@BlinkyStitt
Copy link

What's needed to move this forward? I'm wanting to use an AudioWorkletProcessor but it's not as easy as I was expecting. This seems like it will remove some of the boilerplate that I need.

@daxpedda
Copy link
Collaborator Author

#[wasm_bindgen]
extern "C" {
	type Meta;
	#[wasm_bindgen(js_namespace = import, js_name = meta)]
	static META: Meta;
	#[wasm_bindgen(method, getter)]
	fn url(this: &Meta) -> String;
}

let shim_url: String = META.url();

Is what you can use now, though it won't work with the no-modules target.

I will close this PR as its no longer planned. Not only do we plan to #3355 but now all browsers support ES modules in audio worklets.

@daxpedda daxpedda closed this Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs discussion Requires further discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants