
precompute offsets ahead-of-time rather than on each dereference #141

Merged 2 commits into near:near-main on Sep 23, 2022

Conversation

Ekleog-NEAR (Contributor)

Criterion results on a test related to near/nearcore-private-1#2:

compile                 time:   [9.7873 ms 9.8477 ms 9.9119 ms]
                        change: [-5.2388% -4.2981% -3.2369%] (p = 0.00 < 0.05)
                        Performance has improved.

Manual testing on the exact test of near/nearcore-private-1#2 confirms the ~5% speedup:

Non-rayon goes down to 24.5-25ms from 25.5-26ms
Rayon goes down to 12ms from 12.5ms

To me this also feels more like a code cleanup than complexity added for optimization's sake, but I would understand if other people felt the added invariants amount to added complexity.
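For context, a minimal sketch of the before/after shape of the change (names and the constant 4 are illustrative, not the real VMOffsets fields): previously an offset was recomputed with checked arithmetic on every access; after this PR all offsets are computed once and then simply read back.

// Illustrative sketch only; does not match the actual VMOffsets type.
struct Offsets {
    num_signature_ids: u32,
    // Filled in by `precompute`, then read on every access.
    vmctx_signature_ids_begin: u32,
    vmctx_imported_functions_begin: u32,
}

impl Offsets {
    // Before: every call recomputes the offset with checked arithmetic.
    fn imported_functions_begin_computed(&self) -> u32 {
        // 4 stands in for the size of a signature index; signature ids start at offset 0.
        self.num_signature_ids
            .checked_mul(4)
            .expect("offset overflow")
    }

    // After: compute every offset once, ahead of time, then dereference stored fields.
    fn precompute(&mut self) {
        self.vmctx_signature_ids_begin = 0;
        self.vmctx_imported_functions_begin = self
            .num_signature_ids
            .checked_mul(4)
            .and_then(|x| x.checked_add(self.vmctx_signature_ids_begin))
            .expect("offset overflow");
    }
}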

matklad (Contributor) commented Sep 22, 2022

Rayon goes down to 12ms from 12.5ms

I think we agreed to just kill rayon (near/nearcore#8948); could you send a quick PR to do that as a follow-up?


fn precompute(&mut self) {
    self.vmctx_signature_ids_begin = 0;
    self.vmctx_imported_functions_begin = self
nagisa (Collaborator) commented Sep 22, 2022

This is so much more comprehensible than the previous code already, but I do wonder if there’s an opportunity to further simplify this code with some helpers.

For instance: instead of manually chaining checked_add and checked_mul, that logic could live in a utility function along the lines of fma (fused multiply-add).

That way we’d be looking at something like

self.vmctx_imported_functions_begin = offset_fma(
    self.num_signature_ids,
    u32::from(self.size_of_vmshared_signature_index()),
    self.vmctx_signature_ids_begin,
);

which seems like less noise overall. (names/argument ordering at your choice)
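A minimal sketch of what such a helper could look like; offset_fma is not an existing API, and the name, argument order, and overflow handling here are one possible choice, not necessarily what the PR ended up with:

/// Hypothetical helper: computes `a * b + c` with checked arithmetic,
/// panicking on overflow. Purely illustrative of the suggestion above.
fn offset_fma(a: u32, b: u32, c: u32) -> u32 {
    a.checked_mul(b)
        .and_then(|ab| ab.checked_add(c))
        .expect("vmctx offset overflow")
}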

Ekleog-NEAR (Contributor, Author)

Great idea, thanks! I've just pushed a commit doing this change :)

Ekleog-NEAR merged commit d30991d into near:near-main on Sep 23, 2022
bors bot added a commit to wasmerio/wasmer that referenced this pull request Nov 21, 2022
3292: Precompute offsets in VMOffsets r=ptitSeb a=ptitSeb

# Description
Small optimisation: Precompute Offsets in VMOffsets

based on near/wasmer#141

For #3305

Co-authored-by: ptitSeb <sebastien.chev@gmail.com>