[compiler-v2] Optimize stackless-bytecode assign instructions #15445

Merged
merged 2 commits into from
Dec 4, 2024

Conversation

vineethk
Contributor

@vineethk vineethk commented Dec 2, 2024

Description

When translating an Assign(dst, src) stackless bytecode instruction to file format bytecode, we copy or move src onto the stack, then eagerly store it to a local corresponding to dst.

With this change, when the AVOID_STORE_IN_ASSIGNS experiment is enabled, we instead copy or move src onto the stack and simply rename that stack value to dst, which in many cases avoids the store.

Another way to think about this optimization: dst = src; is now treated similarly to dst = noop_function(src);. We push src as if it were an argument, "consume" it (popping it off the abstract stack), and then push dst onto the abstract stack as the result of the noop_function call.
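The effect can be sketched with a toy model. This is not the compiler's actual code generator: the instruction tuples, the `translate` function, and its `avoid_store_in_assigns` flag are hypothetical names made up for illustration, and the sketch assumes every temporary is used exactly once (so every load is a `MoveLoc`). It only aims to show how renaming the value on the abstract stack lets a following instruction consume it without a store/load round trip.

```python
def translate(instrs, avoid_store_in_assigns):
    """Toy translation of stackless instructions to stack-machine instructions."""
    out = []
    stack = []  # abstract stack: names of temps whose values sit on the operand stack

    def flush():
        # Spill every value on the abstract stack into its local, top of stack first.
        while stack:
            out.append(f"StLoc {stack.pop()}")

    def push_args(args):
        # If the abstract stack already holds a prefix of the arguments, reuse it;
        # otherwise spill the stack and load every argument from its local.
        args = list(args)
        if stack != args[:len(stack)]:
            flush()
        for name in args[len(stack):]:
            out.append(f"MoveLoc {name}")
        stack.clear()  # the arguments are consumed by the instruction

    for instr in instrs:
        op = instr[0]
        if op == "assign":
            _, dst, src = instr
            push_args([src])
            if avoid_store_in_assigns:
                stack.append(dst)           # rename: the value on the stack is now dst
            else:
                out.append(f"StLoc {dst}")  # eager store into dst's local
        elif op == "add":
            _, dst, lhs, rhs = instr
            push_args([lhs, rhs])
            out.append("Add")
            stack.append(dst)
        elif op == "ret":
            _, src = instr
            push_args([src])
            out.append("Ret")
    flush()
    return out


# x = a; y = x + b; z = y; return z
program = [
    ("assign", "x", "a"),
    ("add", "y", "x", "b"),
    ("assign", "z", "y"),
    ("ret", "z"),
]

eager = translate(program, avoid_store_in_assigns=False)
renamed = translate(program, avoid_store_in_assigns=True)
print(len(eager), eager)      # 8 instructions
print(len(renamed), renamed)  # 4 instructions
```

In this toy program both assignments become free once the value is renamed in place, because the next instruction wants exactly that value on top of the stack. When the next instruction needs something else, the renamed value still has to be spilled, so the saving depends on the surrounding code.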

  • With this optimization on, the v2 compiler produces about 1% fewer bytecode instructions for aptos-framework (and all of its dependencies) than the v1 compiler.
  • Compared to the v2 compiler without this optimization, the v2 compiler with it produces 2.2% fewer instructions when compiling aptos-framework and its dependencies.

The maximum gas used to execute a governance proposal also goes down (by about 2.5%) with compiler v2 compared to compiler v1.

How Has This Been Tested?

Key Areas to Review

  • Correctness of the optimization

Type of Change

  • Performance improvement

Which Components or Systems Does This Change Impact?

  • Move Compiler


trunk-io bot commented Dec 2, 2024

⏱️ 4h 48m total CI duration on this PR
| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| execution-performance / single-node-performance | 1h 31m | 🟩🟩🟩🟩 |
| test-target-determinator | 17m | 🟩🟩🟩🟩 |
| execution-performance / test-target-determinator | 17m | 🟩🟩🟩🟩 |
| check | 15m | 🟩🟩🟩🟩 |
| rust-cargo-deny | 14m | 🟩🟩🟩🟩 (+3 more) |
| rust-move-tests | 13m | 🟩 |
| rust-move-tests | 13m | 🟩 |
| rust-move-tests | 13m | 🟩 |
| rust-move-tests | 13m | 🟩 |
| rust-move-tests | 13m | 🟩 |
| rust-move-tests | 13m | 🟩 |
| check-dynamic-deps | 9m | 🟩🟩🟩🟩🟩 (+3 more) |
| rust-move-tests | 8m | 🟥 |
| fetch-last-released-docker-image-tag | 6m | 🟩🟩🟩🟩 |
| rust-doc-tests | 6m | 🟩 |

Contributor Author

vineethk commented Dec 2, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

@vineethk vineethk marked this pull request as ready for review December 2, 2024 20:27
@vineethk vineethk requested review from rahxephon89, fEst1ck and brmataptos and removed request for davidiw, areshand and movekevin December 2, 2024 20:28
8: Add
9: StLoc[3](loc0: u64)
10: Branch(2)
6: MoveLoc[2](Arg2: u64)
Contributor Author

This is the only test case where we produce more instructions.

@vineethk vineethk enabled auto-merge (squash) December 4, 2024 02:08

Contributor

github-actions bot commented Dec 4, 2024

✅ Forge suite realistic_env_max_load success on 20449fa043da5443e728e3c4677ab9e719e79db1

two traffics test: inner traffic : committed: 14395.99 txn/s, latency: 2764.41 ms, (p50: 2700 ms, p70: 2700, p90: 2800 ms, p99: 3200 ms), latency samples: 5473620
two traffics test : committed: 100.02 txn/s, latency: 1956.10 ms, (p50: 1700 ms, p70: 2000, p90: 2100 ms, p99: 18000 ms), latency samples: 1680
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.226, avg: 1.332", "ConsensusProposalToOrdered: max: 0.317, avg: 0.287", "ConsensusOrderedToCommit: max: 0.300, avg: 0.292", "ConsensusProposalToCommit: max: 0.586, avg: 0.579"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.01s no progress at version 2479353 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 15.90s no progress at version 2479351 (avg 15.90s) [limit 16].
Test Ok

Contributor

github-actions bot commented Dec 4, 2024

✅ Forge suite framework_upgrade success on 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1

Compatibility test results for 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1 (PR)
Upgrade the nodes to version: 20449fa043da5443e728e3c4677ab9e719e79db1
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 457.35 txn/s, submitted: 485.22 txn/s, failed submission: 0.88 txn/s, expired: 27.87 txn/s, latency: 4679.04 ms, (p50: 2100 ms, p70: 3800, p90: 15700 ms, p99: 19000 ms), latency samples: 41567
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 490.01 txn/s, submitted: 522.80 txn/s, failed submission: 1.16 txn/s, expired: 32.78 txn/s, latency: 6729.48 ms, (p50: 1500 ms, p70: 2100, p90: 26700 ms, p99: 27800 ms), latency samples: 33930
5. check swarm health
Compatibility test for 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1 passed
Upgrade the remaining nodes to version: 20449fa043da5443e728e3c4677ab9e719e79db1
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1253.89 txn/s, submitted: 1256.32 txn/s, failed submission: 2.43 txn/s, expired: 2.43 txn/s, latency: 2365.96 ms, (p50: 2000 ms, p70: 2400, p90: 4200 ms, p99: 6100 ms), latency samples: 113400
Test Ok

Contributor

github-actions bot commented Dec 4, 2024

✅ Forge suite compat success on 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1

Compatibility test results for 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1 (PR)
1. Check liveness of validators at old version: 3527aa2e299553b759c515d9843586bad48c802c
compatibility::simple-validator-upgrade::liveness-check : committed: 14247.80 txn/s, latency: 2422.05 ms, (p50: 2100 ms, p70: 2200, p90: 2600 ms, p99: 7600 ms), latency samples: 531140
2. Upgrading first Validator to new version: 20449fa043da5443e728e3c4677ab9e719e79db1
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7230.55 txn/s, latency: 3809.58 ms, (p50: 3800 ms, p70: 4500, p90: 5200 ms, p99: 5700 ms), latency samples: 134620
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7424.95 txn/s, latency: 4377.34 ms, (p50: 4500 ms, p70: 4700, p90: 6200 ms, p99: 6400 ms), latency samples: 248280
3. Upgrading rest of first batch to new version: 20449fa043da5443e728e3c4677ab9e719e79db1
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6932.96 txn/s, latency: 4096.36 ms, (p50: 4400 ms, p70: 4600, p90: 5300 ms, p99: 5700 ms), latency samples: 130340
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7172.87 txn/s, latency: 4534.61 ms, (p50: 4800 ms, p70: 5000, p90: 5900 ms, p99: 6600 ms), latency samples: 245460
4. upgrading second batch to new version: 20449fa043da5443e728e3c4677ab9e719e79db1
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 1310.57 txn/s, latency: 17544.44 ms, (p50: 18600 ms, p70: 24200, p90: 27900 ms, p99: 28900 ms), latency samples: 56760
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8016.65 txn/s, latency: 4052.66 ms, (p50: 2600 ms, p70: 2700, p90: 11700 ms, p99: 12300 ms), latency samples: 259740
5. check swarm health
Compatibility test for 3527aa2e299553b759c515d9843586bad48c802c ==> 20449fa043da5443e728e3c4677ab9e719e79db1 passed
Test Ok

3 participants