LLVM global constructor support when targeting WebAssembly #82371

Closed
chinedufn opened this issue Feb 21, 2021 · 19 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. O-wasm Target: WASM (WebAssembly), http://webassembly.org/

Comments

@chinedufn
Contributor

chinedufn commented Feb 21, 2021

Opening this issue as a follow-up to @alexcrichton's comment on supporting static constructors in wasm-bindgen: rustwasm/wasm-bindgen#1216 (comment).

I'd like to implement support for static constructors in wasm on the rustc side.

I'm a first time contributor that could use some guidance.

I've outlined what I think needs to get done below. In areas where it seemed much easier to ask for help than to try to figure it out myself, I left notes about the sort of guidance that I thought I needed.

I don't know what I don't know though, so please keep your eyes peeled for incorrect assumptions or things that I've completely overlooked.

Or just generally any other advice. All tips are welcomed.

Thank you.

Simple Illustration

Given the code

#![feature(wasm_global_ctors)]

#[wasm_global_ctor]
pub fn some_function() {
    log("Hello, ctor!");
}

#[wasm_global_ctor]
pub fn another_function() {
    log("Hello, ctor!");
}

We need to emit the following LLVM IR. (Thanks to @Frizi for investigating this.)

target triple = "wasm32-unknown-unknown"

declare hidden void @some_function()
declare hidden void @another_function()

@llvm.global_ctors = appending global [2 x { i32, void ()*, i8* }] [
  { i32, void ()*, i8* } { i32 4000, void ()* @some_function, i8* null },
  { i32, void ()*, i8* } { i32 4000, void ()* @another_function, i8* null }
]

Note that we should also support the #[wasm_global_ctor] attribute for static FOO: extern "C" fn() = { /* ... */ } function pointers, but I'd like to focus on landing function support first unless someone says otherwise.

(Note that I have no real experience working with LLVM directly so let me know if I am missing something.)

Test Cases

Here are the test cases that I think will be needed.

I noted the ones where I could use some guidance on how/where to best implement them.

Feature Gate

Introduce these files:

src/test/ui/feature-gates/issue-XXXX-gating-of-wasm-global-ctor-error.rs
src/test/ui/feature-gates/issue-XXXX-gating-of-wasm-global-ctor-error.stderr

Verifying that #[wasm_global_ctor] is ignored on non-wasm32 targets

Guidance on where to add this and links to similar existing tests would be helpful please.

Verifying that #[wasm_global_ctor] leads to the correct @llvm.global_ctors

Guidance on where to add these and links to similar existing tests would be helpful please.

  • A test case with one #[wasm_global_ctor] function

  • A test case with one #[wasm_global_ctor] STATIC variable that is a function pointer

  • A test case with two #[wasm_global_ctor] attributes so that we can verify that we properly handle multiple attribute instances.

Parsing #[wasm_global_ctor] attribute

Here's what I think is needed to parse the #[wasm_global_ctor] attribute.


Create a symbol for wasm_global_ctor

wasm_import_module,


Add a method to check whether wasm_global_ctor is applied to a function

https://github.com/rust-lang/rust/blob/master/compiler/rustc_passes/src/check_attr.rs#L69


Add CodegenFnAttrFlags::WASM_GLOBAL_CTOR

pub struct CodegenFnAttrFlags: u32 {

And set it in the codegen_fn_attrs

} else if tcx.sess.check_name(attr, sym::used) {


Add a gated!(wasm_global_ctor, AssumeUsed, template!(Word)

https://github.com/rust-lang/rust/blob/dbdbd30bf2cb0d48c8bbce83c2458592664dbb18/compiler/rustc_feature/src/builtin_attrs.rs#L236-L235

Generating the LLVM IR

Guidance on where/how to do these along with some links to similar code would be helpful please.

  • We need to store a set of all global constructor function names that we come across.

  • At some point we need to iterate that set and generate the appropriate LLVM IR (example of how this looks can be found above)
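The two steps above could be sketched roughly as follows. This is a standalone illustration that renders the IR as text, not rustc code: the function name `render_global_ctors` and the use of a `BTreeSet<String>` are assumptions made for the example, and a real implementation would build the global through the codegen backend rather than format a string.

```rust
use std::collections::BTreeSet;

/// Renders an `@llvm.global_ctors` definition for the collected constructor
/// names, or `None` when there is nothing to emit.
/// Hypothetical helper: rustc would construct this global via its LLVM
/// bindings rather than by formatting text.
fn render_global_ctors(ctors: &BTreeSet<String>, priority: u32) -> Option<String> {
    if ctors.is_empty() {
        return None;
    }
    // Each entry is a (priority, function, associated data) triple.
    let entry_ty = "{ i32, void ()*, i8* }";
    let entries: Vec<String> = ctors
        .iter()
        .map(|name| format!("  {entry_ty} {{ i32 {priority}, void ()* @{name}, i8* null }}"))
        .collect();
    Some(format!(
        "@llvm.global_ctors = appending global [{} x {entry_ty}] [\n{}\n]",
        ctors.len(),
        entries.join(",\n")
    ))
}
```

A `BTreeSet` iterates in sorted order, so the output is deterministic but alphabetical rather than in declaration order; a `Vec` would be the choice if declaration order should be preserved.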

Summary

Please let me know if there are key pieces that are not outlined above and I will add them in.

I'd like to implement this if that's alright, so all guidance and tips are very much appreciated.

@jonas-schievink jonas-schievink added C-feature-request Category: A feature request, i.e: not implemented / a PR. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ labels Feb 21, 2021
@nagisa
Member

nagisa commented Feb 21, 2021

Isn't this the life-before-main stuff that we've been historically against adding to Rust? It seems that adding a way to specify llvm.global_ctors would be going against that philosophy and would require an RFC.

@chinedufn
Contributor Author

chinedufn commented Feb 21, 2021

This was discussed in these two comments:

rustwasm/wasm-bindgen#1216 (comment)
rustwasm/wasm-bindgen#1216 (comment)

Over time philosophies shift ever so slightly and such, but nowadays I think it'd be fine to roughly add something like this. We don't want to hide features in rustc that a platform has, and as mmastrac has shown this is already somewhat supported on platforms today!


My particular use case is that I want to be able to use typetag and rust-ctor in wasm.

My takeaway was that since static constructors are already possible on a number of other targets (Linux, OSX, FreeBSD, NetBSD, OpenBSD, Android, iOS, and Windows) there is no reason to not allow them to be used in wasm.


Also, just to clarify, this issue only addresses being able to make use of llvm.global_ctors in wasm in order to enable use cases that are already possible on a number of other targets.

This issue does not seek to expose llvm.global_ctors related flags to rustc or anything of that sort.

@ghost

ghost commented Feb 21, 2021

My takeaway was that since static constructors are already possible on a number of other targets (Linux, OSX, FreeBSD, NetBSD, OpenBSD, Android, iOS, and Windows) there is no reason to not allow them to be used in wasm.

FWIW I read this in rust-lang/miri#450 (comment):

The only reason any of this works is because it's an OS feature we can't really stop you from opting into.

But... could .init_array and .fini_array be made to work on WASM (if possible at all, I'm not really sure)?

@nagisa
Member

nagisa commented Feb 21, 2021

I'm not sure a single comment by one person is sufficient grounds to change the overall direction of a project as large as Rust.

We can't do much about people abusing a more general mechanism (#[link_section] for placing code in arbitrary sections) to manually implement life-before-main. Them doing so doesn't mean the project's take on the topic has changed and that we should build into the language/compiler a feature predominantly designed to achieve life-before-main. Even if the functionality is target-specific for a target which can't do so through similarly underhanded means.

Besides the general direction, there are a number of issues with relying on LLVM to implement this. For instance it is not obvious what the implications of this feature are wrt other backends. And if we do implement this for wasm… why not other targets as well? Why not add the destructors as well? Is a function attribute the best design? Can people call these constructor functions from after-main code too? Why/why not? RFC should answer these and other questions.

@chinedufn
Contributor Author

I am not suggesting that it is. I also don't know if he feels that way today. Just linking to some prior discussion and sharing how I interpreted it.

Them doing so doesn't mean the project's take on the topic has changed and that we should build into the language/compiler a feature predominantly designed to achieve life-before-main. Even if the functionality is target-specific for a target which can't do so through similarly underhanded means.

This makes sense to me, thank you.

Why/why not? RFC should answer these and other questions.

Thanks for explaining.

I think that I was a bit blinded by trying to solve my own problem (typetag does not currently work in wasm), and ended up ignoring the wider implications.

My current thinking is that if Rust has a strong stance against supporting before-main features, then it would not be useful to submit an RFC for a before-main feature.

@nagisa
Member

nagisa commented Feb 21, 2021

My current thinking is that if Rust has a strong stance against supporting before-main features, then it would not be useful to submit an RFC for a before-main feature.

I think it's going to be difficult to correctly gauge the sentiment today without writing at least a pre-RFC on internals.rlo. The sentiment was definitely negative in the past, but more than once a good RFC changed opinions and features got implemented. It is feasible that an RFC for such a feature could be accepted today, especially now that there are somewhat widespread use cases for it.

@dtolnay
Member

dtolnay commented Feb 21, 2021

FWIW as the author of the mentioned typetag crate and the underlying inventory crate that makes it work, I am also opposed to putting anything specific to that model into rustc. I strongly believe https://github.com/dtolnay/linkme is the correct model to pursue in rustc and does not involve any life before main.

@chinedufn
Contributor Author

chinedufn commented Feb 21, 2021

@nagisa awesome, thanks for explaining that.

It looks like there was a pre-RFC on the global constructor approach, with a lot of discussion: https://internals.rust-lang.org/t/pre-rfc-add-language-support-for-global-constructor-functions/9840

This comment about ditching global constructors for a different approach stood out to me: https://internals.rust-lang.org/t/pre-rfc-add-language-support-for-global-constructor-functions/9840/10

I'm going to do a bit more research on the linkme approach and then start working on a pre-RFC.

Thanks for your help and suggestions.

@ghost

ghost commented Feb 21, 2021

Personally I'm still curious about #[link_section = ".init_array"] on WASM. That seems a useful workaround and maybe useful elsewhere, and I guess it only requires minor changes.
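For context, here is a minimal sketch of the `.init_array` technique as it works today on a native ELF target such as Linux. It does not show anything working on wasm. The names `CTOR_RAN`, `before_main`, `INIT`, and `ctor_ran` are made up for illustration. On Rust 1.82 and later the attribute is written `#[unsafe(link_section = "...")]`, which is the form used here; older toolchains accept the plain `#[link_section = "..."]` spelling.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the constructor; read later to observe that it ran.
static CTOR_RAN: AtomicBool = AtomicBool::new(false);

// The function the loader calls before `main`. Keep it tiny: the Rust
// standard library makes no promises about what is usable this early.
#[cfg_attr(not(target_os = "linux"), allow(dead_code))]
extern "C" fn before_main() {
    CTOR_RAN.store(true, Ordering::SeqCst);
}

// Placing a function pointer in `.init_array` asks the ELF startup code to
// call it during program initialization. `#[used]` keeps the otherwise
// unreferenced static from being discarded.
// Assumption: only Linux is handled here; other platforms use other sections.
#[cfg(target_os = "linux")]
#[used]
#[unsafe(link_section = ".init_array")]
static INIT: extern "C" fn() = before_main;

/// Reports whether the constructor has run.
pub fn ctor_ran() -> bool {
    CTOR_RAN.load(Ordering::SeqCst)
}
```

On Linux, calling `ctor_ran()` as the first statement of `main` should already return `true`; on targets where the `cfg` does not apply, the static is simply absent and the flag stays `false`.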

@alexcrichton
Member

I only briefly skimmed this thread via notifications and just now, but I personally believe it's important for rustc to provide bindings to the features of the WebAssembly target. I don't think that there's really any reason that design of the standard library and idioms of a language should forbid usage of a feature on a target.

I'm not personally convinced that the design proposed here is appropriate for exposing the start function feature of WebAssembly, but saying "Rust doesn't currently have life before main" I do not think is a good reason to say that Rust will never support the start function. That seems harmful to Rust's portability to WebAssembly.

@chinedufn
Contributor Author

chinedufn commented Feb 23, 2021

Looked into the linkme approach.

As far as I can tell, it is not possible for a typetag based on linkme (or on a hypothetical RFC'd, first-class, linkme-like construct in rustc) to work in wasm, since custom sections in wasm cannot contain function pointers.

https://github.com/rust-lang/rust/blob/e80c86c535b74cdd65c4b75b94ce38342d0bb946/src/librustc_typeck/check/mod.rs#L1444-L1448

Not to say that it is definitely impossible. I just don't currently see any way to hack around that limitation after looking into it for a bit.


For posterity.

I'm just going to use a hand maintained enum for serialization/deserialization.

Slightly less convenient than typetag, but meets all of my needs just fine for now.
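A minimal sketch of what a hand-maintained enum might look like, using only the standard library. The `Event` type, its variants, and the `tag:payload` text format are invented for illustration and are not taken from the project in question; real code would more likely derive `serde` traits on the enum.

```rust
/// Every serializable type is listed here by hand, instead of being
/// registered automatically the way typetag does.
#[derive(Debug, PartialEq)]
pub enum Event {
    Click { x: i32, y: i32 },
    KeyPress(char),
}

impl Event {
    /// Encodes the value as `tag:payload`.
    pub fn encode(&self) -> String {
        match self {
            Event::Click { x, y } => format!("click:{x},{y}"),
            Event::KeyPress(c) => format!("key:{c}"),
        }
    }

    /// Decodes a `tag:payload` string; unknown tags and malformed payloads
    /// yield `None`.
    pub fn decode(input: &str) -> Option<Event> {
        let (tag, payload) = input.split_once(':')?;
        match tag {
            "click" => {
                let (x, y) = payload.split_once(',')?;
                Some(Event::Click { x: x.parse().ok()?, y: y.parse().ok()? })
            }
            "key" => {
                let mut chars = payload.chars();
                let c = chars.next()?;
                // Reject payloads longer than one character.
                if chars.next().is_some() {
                    return None;
                }
                Some(Event::KeyPress(c))
            }
            _ => None,
        }
    }
}
```

The cost compared with typetag is that every new type needs a new variant and a new `match` arm in one central place; the benefit is that nothing depends on constructors or linker sections.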


Thanks a lot for the help everyone.

@dtolnay
Member

dtolnay commented Feb 23, 2021

@chinedufn the point of building it as a rustc feature instead of emulating it in the linker is to avoid any link_section related platform specific restrictions. For example, #[test] already works like this on wasm. The compiler builds the table of function pointers, not the linker.

@chinedufn
Contributor Author

Ahhhhhhhhh now I see. Ok I may be back in the game here. I've been itching to make my first rustc RFC / implementation.


Ok here's my understanding of the very high level gist of this:

So, basically, in wasm we generate a table with all of the functions and it would be up to the user to read and make use of this at runtime (probably within something like lazy_static!).

In not(wasm) we would just straight up set the value of the static list at compile time for you (Are there any good examples of rustc initializing a static variable for you that you can link me to so that I can read up?).

@dtolnay mind linking me to where all of this stuff is handled for #[test] in wasm so that I can read up?

Thank you for explaining this. I'll try to find code examples to read myself, but if in the meantime you have links for any of the above, that would be awesome.

@ghost

ghost commented Feb 23, 2021

I believe "to avoid any link_section related platform specific restrictions" means the compiler should do the same thing for wasm and not(wasm) totally at compile time, that is, it should "just straight up set the value of the static list at compile time for you" on all targets.

Personally I expect the compiler to generate something like extern "Rust" { static THE_SLICE: &[T]; } for #[distributed_slice], emit something in the (downstream) crate metadata for #[distributed_slice(SOMETHING)] variables just like public (to the compiler only) consts, collect all those #[distributed_slice(SOMETHING)] variables from the metadata (and the compiling crate itself) when compiling the "final" (bin/cdylib/staticlib) crate, and emit a static THE_SLICE: &[T] = &[crate::COLLECTED_CONSTANT, dependency::COLLECTED_CONSTANT] (that defines the initial extern) (in the final crate).
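A rough sketch of the end result described above, written by hand as ordinary Rust. Two modules stand in for separate crates, and the final `static` is what the compiler would conceptually emit after collecting the contributions. The `Plugin` type and the `crate_a`, `crate_b`, and `find` names are assumptions for the example; no such compiler feature exists in this form.

```rust
/// The element type every contributor agrees on.
pub struct Plugin {
    pub name: &'static str,
    pub run: fn() -> u32,
}

// Stand-in for one dependency crate contributing an element.
mod crate_a {
    use super::Plugin;
    fn run() -> u32 {
        1
    }
    pub const COLLECTED_CONSTANT: Plugin = Plugin { name: "a", run };
}

// Stand-in for a second dependency crate.
mod crate_b {
    use super::Plugin;
    fn run() -> u32 {
        2
    }
    pub const COLLECTED_CONSTANT: Plugin = Plugin { name: "b", run };
}

// What would be generated in the final (bin/cdylib/staticlib) crate: a plain
// static initialized at compile time, with no linker sections and no code
// running before `main`.
pub static THE_SLICE: &[Plugin] = &[
    crate_a::COLLECTED_CONSTANT,
    crate_b::COLLECTED_CONSTANT,
];

/// Looks up a contributed element by name at runtime.
pub fn find(name: &str) -> Option<&'static Plugin> {
    THE_SLICE.iter().find(|p| p.name == name)
}
```

Because the table is an ordinary `static`, it can hold function pointers on wasm just as on any other target, which is the restriction the linker-section approach runs into.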

Also AFAIK #[test] only works within one crate, but something like linkme needs to work across different crates, so what #[test] does probably won't help much.

@chinedufn
Contributor Author

@hyd-dev thanks for illustrating.

I think the big thing missing for me in order to tackle an RFC is some links to parts of the rustc codebase that I can read over in order to better understand where these pieces might fit together.

I can poke around and try to piece things together (i.e. I need to take a look through the rustc dev book to start getting myself familiar), but if you have any links to the rustc codebase that relate to parts of what you've just explained, that would save me a bunch of time.

Thank you for the help.


@LLBlumire
Contributor

@chinedufn has there been any progress on this RFC? It seems like this issue is the block on rustwasm/wasm-bindgen#1216 which has been meaningfully blocked for 3 years now, and I'm attempting to pick up the gauntlet of getting typetag working on WASM.

If there has not been, is it worth reopening this issue, as it seems it's still a feature missing from Rust that needs addressing?

@workingjubilee
Member

Programming languages are defined by what they do not include. Rust is technically Turing-equivalent, but it makes it very hard to use it that way. Meanwhile, adding platform-specific support for this directly to rustc is tantamount to adding life-before-main to the Rust programming model. And then from there, once it is in Rust-on-wasm, it is easy to argue that this support should be extended to other platforms. Remember, we have OsStr almost entirely because Windows is stubbornly eccentric. Adding support for a platform-specific eccentricity can and historically has distorted the programming model for all platforms.

wasm's refusal to adopt a sensible and reasonably deterministic model for floats certainly already induces enough problems for trying to formalize Rust irrespective of platform.

And then with global constructors and life before main being formally part of Rust, we have something that will almost certainly involve lifetimes, yet may be poorly checked or verified by the compiler, and thus outside the actual Rust type system, despite being in the programming model. I think dtolnay is right to assert that these sorts of things should be compile-time only, as that is the easiest way to avoid excessive nonsense. Alternatively, an actual model for their inclusion, a formal form of life-before-main, should actually be built out first.

@dtolnay
Member

dtolnay commented Jan 15, 2023

The next step is somebody needs to write an RFC for distributed_slice and put up a compiler-based (not linker-based) prototype implementation.

This is not the right issue for that so it's not going to be reopened.
