Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Re-enable null check optimization for mid-method traces #84058

Merged
merged 3 commits into from
Apr 3, 2023

Conversation

kg
Copy link
Member

@kg kg commented Mar 29, 2023

This PR moves the ldloca analysis from the jiterpreter trace generator to the pass that inserts entry points, producing a bitset with a 1 for each interpreter stack slot that has its address taken. As a result we're able to remove the 'trace must start at the beginning of the method' restriction for null check optimization, and we pick up some small correctness improvements along with it.

@kg kg added arch-wasm WebAssembly architecture area-Codegen-Jiterpreter-mono labels Mar 29, 2023
@kg kg requested review from lewing and pavelsavara as code owners March 29, 2023 03:13
@ghost ghost assigned kg Mar 29, 2023
@ghost
Copy link

ghost commented Mar 29, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Right now traces in the middle of methods don't support null check optimizations since we don't know if ldloca happened previously in the method. This PR adds a quick scan of the method leading up to the trace for ldlocas so we have the information we need for (partial) null check optimization.

Author: kg
Assignees: -
Labels:

arch-wasm, area-Codegen-Jiterpreter-mono

Milestone: -

@kg
Copy link
Member Author

kg commented Mar 29, 2023

cc @BrzVlad if you have time to look, does this make sense to you?

@BrzVlad
Copy link
Member

BrzVlad commented Mar 29, 2023

Interp promotes address taken vars to be global in initialize_global_vars. I think this would be a good starting point for interp to start providing information to jiterp in a well structured fashion, that can be expanded upon later. Even a simple boolean on whether the method has ldloca would still be something so we can avoid unnecessary iteration for the majority of methods.

This is also fine if you prefer to do it later.

@kg
Copy link
Member Author

kg commented Mar 29, 2023

Interp promotes address taken vars to be global in initialize_global_vars. I think this would be a good starting point for interp to start providing information to jiterp in a well structured fashion, that can be expanded upon later. Even a simple boolean on whether the method has ldloca would still be something so we can avoid unnecessary iteration for the majority of methods.

This is also fine if you prefer to do it later.

That is perfect, thanks for pointing out where that happens. I've previously experimented with flowing variable data through with mixed success but I think the next time I try I'll be able to get it in good shape. It's already on the todo list in the tracking issue, I just haven't had time to redo this stuff from the ground up and flow the information through.

A 'this method has ldloca' flag sounds like a really simple starting point that would be valuable.

@kg kg added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 29, 2023
…spaces in the stack have had their address taken

Use the interpreter bitset for jiterpreter null check optimization
Enable null check optimization for all traces
@kg kg force-pushed the wasm-jiterp-null-check-mid-method branch from d566fb6 to 808ac5b Compare March 30, 2023 18:10
@kg kg requested a review from kotlarmilos as a code owner March 30, 2023 18:10
@kg kg removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 30, 2023
@kg
Copy link
Member Author

kg commented Mar 30, 2023

Reworked inspired by @BrzVlad 's feedback: We now build a bitset when inserting trace(s) into a method, where each bit indicates whether the 8 bytes in that slot ever have their address taken during the method.

In my testing this matches the old analysis 99%+ of the time, so it shouldn't cause any performance regressions, but it's also more correct since it will handle the case vlad pointed out (ldlocas after the current instruction we haven't seen yet, in methods with loops). This also handles the hypothetical case of us null check optimizing a field of a stack-allocated struct and then someone ldloca'ing the whole struct, because we can ensure all the slots occupied by the struct have the address-taken flag set now. (The jiterpreter doesn't currently know the type or size of locals)

The memory cost of the bitset feels potentially bad but I'm not sure how I could do much better here. Maybe a packed list of offsets and sizes instead?

@kg kg requested a review from vargaz March 31, 2023 04:42
@@ -942,6 +942,30 @@ trace_info_alloc () {
return index;
}

static void
build_address_taken_bitset (TransformData *td, InterpBasicBlock *bb, guint32 bitset_size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, you can further optimize this a bit. For example, you can add a new field to TransformData, something like max_indirect_offset that you set in

td->locals [var].flags |= INTERP_LOCAL_FLAG_GLOBAL;
to be td->total_locals_size. You can then run this pass only if this field is non zero and you can also have less memory use for the bitset array (note you would need to do a range check in jiterp for overflowing offsets)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great, but I'd prefer to do it in a follow-up PR. Is that OK?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, also if you consider this to be worth it

@kg kg merged commit 4b53510 into dotnet:main Apr 3, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly architecture
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants