Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[forge] indexer SuccessCriteria and new test #14851

Merged
merged 2 commits into from
Oct 8, 2024

Conversation

rustielin
Copy link
Contributor

@rustielin rustielin commented Oct 2, 2024

Description

Adds some latency-based success criteria to Forge runs with the indexer.

Also start to add a new class of indexer_* test suites

How Has This Been Tested?

CI. Also run it ad-hoc many times. Use this to calibrate some of the latency criteria: https://github.com/aptos-labs/aptos-core/actions/runs/11185219106 (see the run history on GHA)

  1. run1
  2. run2
  3. run3
  4. run4
  5. run5
  6. run6
  7. run7

Key Areas to Review

Metrics used for the success criteria; new test suite workloads

For the success criteria thresholds, based on some numbers from:

For example:
image

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Oct 2, 2024

⏱️ 1h 24m total CI duration on this PR
Slowest 15 Jobs Cumulative Duration Recent Runs
test-target-determinator 17m 🟩🟩🟩🟩
rust-cargo-deny 14m 🟩🟩🟩🟩🟩 (+3 more)
check-dynamic-deps 11m 🟩🟩🟩🟩🟩 (+4 more)
execution-performance / test-target-determinator 6m 🟩
rust-doc-tests 4m 🟩
check 4m 🟩
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+4 more)
general-lints 3m 🟩🟩🟩🟩🟩 (+3 more)
rust-move-tests 2m 🟩
rust-move-tests 2m 🟩
rust-move-tests 2m 🟩
fetch-last-released-docker-image-tag 2m 🟩
file_change_determinator 2m 🟩🟩🟩🟩 (+4 more)
rust-move-tests 2m 🟩
rust-move-tests 2m 🟩

🚨 2 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
execution-performance / test-target-determinator 6m 4m +38%
execution-performance / single-node-performance 10s 18m -99%

settingsfeedbackdocs ⋅ learn more about trunk.io

@rustielin rustielin changed the title [WIP][forge] indexer SuccessCriteria [WIP][forge] indexer SuccessCriteria and new test Oct 3, 2024
@rustielin rustielin force-pushed the rustielin/forge-indexer-succ branch 2 times, most recently from 3f458a2 to a6c75e2 Compare October 3, 2024 18:09
@rustielin rustielin added the CICD:build-images when this label is present github actions will start build+push rust images from the PR. label Oct 3, 2024
@rustielin rustielin marked this pull request as ready for review October 8, 2024 17:48
@rustielin rustielin changed the title [WIP][forge] indexer SuccessCriteria and new test [forge] indexer SuccessCriteria and new test Oct 8, 2024
Copy link
Contributor

@sitalkedia sitalkedia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - let's ensure the test is stable for few days before adding it to the forge stable suite.

@rustielin rustielin enabled auto-merge (squash) October 8, 2024 21:01

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

github-actions bot commented Oct 8, 2024

✅ Forge suite compat success on 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317 (PR)
1. Check liveness of validators at old version: 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775
compatibility::simple-validator-upgrade::liveness-check : committed: 13473.66 txn/s, latency: 2420.21 ms, (p50: 1900 ms, p70: 2100, p90: 4200 ms, p99: 11400 ms), latency samples: 519420
2. Upgrading first Validator to new version: f6581bd1f2d7c5d2ee568fd2e124d472d1b04317
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 6720.64 txn/s, latency: 4195.20 ms, (p50: 4800 ms, p70: 5100, p90: 5300 ms, p99: 5300 ms), latency samples: 123420
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 5725.65 txn/s, latency: 5555.78 ms, (p50: 5700 ms, p70: 5800, p90: 7700 ms, p99: 8100 ms), latency samples: 195840
3. Upgrading rest of first batch to new version: f6581bd1f2d7c5d2ee568fd2e124d472d1b04317
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6583.01 txn/s, latency: 4167.64 ms, (p50: 4600 ms, p70: 4900, p90: 5500 ms, p99: 5800 ms), latency samples: 118960
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7050.84 txn/s, latency: 4561.05 ms, (p50: 4700 ms, p70: 4900, p90: 6300 ms, p99: 6800 ms), latency samples: 235760
4. upgrading second batch to new version: f6581bd1f2d7c5d2ee568fd2e124d472d1b04317
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 9308.09 txn/s, latency: 2813.01 ms, (p50: 2900 ms, p70: 3000, p90: 4400 ms, p99: 5000 ms), latency samples: 190920
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 10096.68 txn/s, latency: 3063.42 ms, (p50: 2800 ms, p70: 2900, p90: 5900 ms, p99: 7600 ms), latency samples: 327860
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317 passed
Test Ok

Copy link
Contributor

github-actions bot commented Oct 8, 2024

✅ Forge suite realistic_env_max_load success on f6581bd1f2d7c5d2ee568fd2e124d472d1b04317

two traffics test: inner traffic : committed: 12950.96 txn/s, latency: 3075.68 ms, (p50: 2700 ms, p70: 2900, p90: 3300 ms, p99: 9700 ms), latency samples: 4924480
two traffics test : committed: 100.06 txn/s, latency: 3127.97 ms, (p50: 2400 ms, p70: 2700, p90: 6800 ms, p99: 11200 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.234, avg: 0.220", "QsPosToProposal: max: 0.331, avg: 0.231", "ConsensusProposalToOrdered: max: 0.325, avg: 0.300", "ConsensusOrderedToCommit: max: 0.484, avg: 0.439", "ConsensusProposalToCommit: max: 0.786, avg: 0.740"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.96s no progress at version 32746 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.59s no progress at version 1677993 (avg 8.20s) [limit 15].
Test Ok

Copy link
Contributor

github-actions bot commented Oct 8, 2024

✅ Forge suite framework_upgrade success on 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317 (PR)
Upgrade the nodes to version: f6581bd1f2d7c5d2ee568fd2e124d472d1b04317
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1141.89 txn/s, submitted: 1144.55 txn/s, failed submission: 2.67 txn/s, expired: 2.67 txn/s, latency: 2625.27 ms, (p50: 2300 ms, p70: 2800, p90: 4500 ms, p99: 6300 ms), latency samples: 102800
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1159.02 txn/s, submitted: 1161.45 txn/s, failed submission: 2.42 txn/s, expired: 2.42 txn/s, latency: 2569.80 ms, (p50: 2100 ms, p70: 2700, p90: 4800 ms, p99: 6300 ms), latency samples: 105280
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> f6581bd1f2d7c5d2ee568fd2e124d472d1b04317 passed
Upgrade the remaining nodes to version: f6581bd1f2d7c5d2ee568fd2e124d472d1b04317
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1210.15 txn/s, submitted: 1212.15 txn/s, failed submission: 2.00 txn/s, expired: 2.00 txn/s, latency: 2427.41 ms, (p50: 2200 ms, p70: 2600, p90: 3800 ms, p99: 5600 ms), latency samples: 108800
Test Ok

@rustielin rustielin merged commit ebc1b64 into main Oct 8, 2024
93 checks passed
@rustielin rustielin deleted the rustielin/forge-indexer-succ branch October 8, 2024 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:build-images when this label is present github actions will start build+push rust images from the PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants