Fix bench-e2e single mode and keep results #1693
Transaction costs
Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using
Script summary
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 5096 | 5.79 | 2.29 | 0.44 |
2 | 5297 | 7.09 | 2.80 | 0.46 |
3 | 5499 | 8.73 | 3.46 | 0.49 |
5 | 5901 | 11.26 | 4.45 | 0.53 |
10 | 6907 | 18.11 | 7.16 | 0.65 |
57 | 16355 | 82.91 | 32.79 | 1.78 |
Commit transaction costs
This uses ada-only outputs for better comparability.
UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 567 | 10.84 | 4.26 | 0.29 |
2 | 758 | 14.31 | 5.80 | 0.34 |
3 | 944 | 17.92 | 7.39 | 0.39 |
5 | 1323 | 25.56 | 10.73 | 0.49 |
10 | 2257 | 47.11 | 19.97 | 0.77 |
19 | 3947 | 94.71 | 39.81 | 1.38 |
CollectCom transaction costs
Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|---|
1 | 57 | 560 | 20.58 | 7.85 | 0.40 |
2 | 113 | 671 | 28.02 | 10.67 | 0.48 |
3 | 171 | 782 | 37.34 | 14.18 | 0.59 |
4 | 227 | 893 | 47.04 | 17.86 | 0.70 |
5 | 282 | 1009 | 56.92 | 21.60 | 0.81 |
6 | 340 | 1116 | 67.08 | 25.44 | 0.92 |
7 | 396 | 1227 | 67.51 | 25.68 | 0.93 |
8 | 448 | 1338 | 80.43 | 30.57 | 1.08 |
9 | 504 | 1449 | 80.47 | 30.65 | 1.09 |
Cost of Decrement Transaction
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 650 | 18.40 | 8.07 | 0.39 |
2 | 731 | 18.50 | 8.83 | 0.40 |
3 | 858 | 19.95 | 10.17 | 0.42 |
5 | 1179 | 23.56 | 13.12 | 0.49 |
10 | 1989 | 32.96 | 20.62 | 0.65 |
47 | 7736 | 96.50 | 73.89 | 1.80 |
Close transaction costs
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 672 | 20.87 | 9.34 | 0.42 |
2 | 846 | 22.67 | 11.10 | 0.45 |
3 | 915 | 23.70 | 12.18 | 0.47 |
5 | 1199 | 26.64 | 15.09 | 0.53 |
10 | 1887 | 34.14 | 22.55 | 0.67 |
50 | 8222 | 99.29 | 86.57 | 1.94 |
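To read a table like this as a trend rather than as individual rows, one can estimate the average memory cost of each additional party between the smallest and largest measured head. The snippet below is only an illustrative sketch over the numbers in the Close table; `marginal_cost` is a hypothetical helper (not part of Hydra), and treating the growth as linear is a simplifying assumption.

```python
# Rows copied from the Close table above: (parties, % max Mem).
CLOSE_MEM = [(1, 20.87), (2, 22.67), (3, 23.70), (5, 26.64), (10, 34.14), (50, 99.29)]


def marginal_cost(rows):
    """Average increase in % max Mem per additional party.

    Hypothetical helper: uses only the first and last measured rows and
    assumes the cost grows roughly linearly in between.
    """
    (n0, m0), (n1, m1) = rows[0], rows[-1]
    return (m1 - m0) / (n1 - n0)


per_party = marginal_cost(CLOSE_MEM)
print(f"~{per_party:.2f} % of the memory budget per additional party")
```

The same helper can be pointed at any of the other per-party tables; comparing the resulting slopes across commits is one way to spot a script that became more expensive.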
Contest transaction costs
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 691 | 26.76 | 11.48 | 0.48 |
2 | 765 | 28.12 | 12.68 | 0.50 |
3 | 968 | 30.31 | 14.59 | 0.54 |
5 | 1204 | 33.94 | 17.72 | 0.61 |
10 | 2027 | 43.82 | 26.44 | 0.78 |
39 | 6521 | 99.67 | 75.65 | 1.79 |
Abort transaction costs
There is some variation due to the random mixture of initial and already committed outputs.
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 4971 | 15.30 | 6.55 | 0.54 |
2 | 5053 | 21.29 | 9.07 | 0.61 |
3 | 5180 | 25.54 | 10.89 | 0.66 |
4 | 5288 | 32.14 | 13.76 | 0.74 |
5 | 5680 | 44.07 | 19.41 | 0.90 |
6 | 5625 | 49.48 | 21.48 | 0.95 |
7 | 5852 | 56.62 | 24.78 | 1.04 |
8 | 6131 | 64.89 | 28.64 | 1.15 |
9 | 6024 | 65.89 | 28.54 | 1.15 |
10 | 6293 | 76.21 | 33.31 | 1.28 |
11 | 6475 | 87.40 | 38.25 | 1.42 |
12 | 6408 | 91.57 | 39.93 | 1.46 |
13 | 6565 | 96.51 | 41.91 | 1.52 |
FanOut transaction costs
Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.
Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|---|---|
10 | 0 | 0 | 5090 | 10.38 | 4.35 | 0.49 |
10 | 1 | 57 | 5123 | 10.96 | 4.81 | 0.50 |
10 | 5 | 285 | 5260 | 15.83 | 7.79 | 0.57 |
10 | 10 | 570 | 5430 | 21.87 | 11.49 | 0.65 |
10 | 20 | 1138 | 5768 | 33.16 | 18.55 | 0.81 |
10 | 30 | 1710 | 6111 | 45.44 | 26.05 | 0.97 |
10 | 40 | 2279 | 6451 | 56.74 | 33.12 | 1.13 |
10 | 50 | 2848 | 6789 | 68.63 | 40.45 | 1.30 |
10 | 76 | 4321 | 7664 | 99.27 | 59.39 | 1.72 |
End-to-end benchmark results
This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.
Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.
Generated at 2024-10-10 10:34:24.332196246 UTC
Baseline Scenario
Number of nodes | 1 |
---|---|
Number of txs | 300 |
Avg. Confirmation Time (ms) | 5.551930950 |
P99 | 12.216956419999997ms |
P95 | 7.627868650000001ms |
P50 | 5.3302375ms |
Number of Invalid txs | 0 |
Three local nodes
Number of nodes | 3 |
---|---|
Number of txs | 900 |
Avg. Confirmation Time (ms) | 24.358177392 |
P99 | 48.859335549999486ms |
P95 | 32.6859822ms |
P50 | 22.443132ms |
Number of Invalid txs | 0 |
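Since the emphasis is on relative rather than absolute results, a quick way to compare two scenarios is to take the ratio of their summary figures. The following is a minimal sketch using the numbers from the two tables above; the dictionary layout and the `relative` helper are assumptions made for illustration and do not reflect the output format of bench-e2e.

```python
# Summary figures copied from the two scenario tables above (milliseconds).
baseline = {
    "avg": 5.551930950,
    "p50": 5.3302375,
    "p95": 7.627868650000001,
    "p99": 12.216956419999997,
}
three_nodes = {
    "avg": 24.358177392,
    "p50": 22.443132,
    "p95": 32.6859822,
    "p99": 48.859335549999486,
}


def relative(a, b):
    """Ratio b/a for every metric present in both summaries (hypothetical helper)."""
    return {k: b[k] / a[k] for k in a if k in b}


ratios = relative(baseline, three_nodes)
for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.2f}x")
```

Tracking these ratios from one CI run to the next is less sensitive to the speed of the particular cloud VM than tracking the raw millisecond values.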
Minor comments; happy to merge if all the tests pass!
In fact, the network tests are failing: https://github.com/cardano-scaling/hydra/actions/runs/11241196905/job/31255255123?pr=1693
This is not ideal, but a lot simpler than doing proper fee calculation. It's unclear why fee calculation was removed before; it is needed when running benchmark scenarios.
This is redundant and can be achieved by using the 'datasets' subcommand.
Before, it was written to a random temporary directory, which made it annoying to generate datasets with this mode.
The hydra-cluster benchmarks now use only a single directory to store the whole state, which is temporary unless a specific output directory is requested.
This reduces some code duplication without much loss of expressiveness (which key we use does not matter).
Same transaction style (single re-spending txs), but a deliberately shorter sequence of transactions (3000 -> 300) to have shorter benchmark run-times, while the sequence should still be long enough to identify regressions. Generated with invocations:

    cabal run bench-e2e -- single --cluster-size 1 --scaling-factor 10

and

    cabal run bench-e2e -- single --cluster-size 3 --scaling-factor 10

Plus some manual amending of the JSON to contain a "title".
As before, the bench-e2e does not assume the hydra node keys to be seeded. This ties the bench-e2e binary (which hard-codes Alice, Bob and Carol) to the configurable list of --hydra-client options to connect to.
This further decouples the bench-e2e binary, which just produces load and provides statistics, from how the hydra-nodes are run. Now the only assumption is that the 'hydra-cluster/config/credentials/faucet.sk' owns funds on the given network.
This fixes two issues with the `bench-e2e` binary / benchmark:

- `single` mode was not working because of a `FeeTooSmallUTxO` error
- `results.csv` is written into a temporary directory and removed, which makes plotting impossible.

I was in the mood for some refactoring, so this also contains various other changes I encountered while working on the code and tidying up a bit.

The refactoring separated hydra node and payment keys further, which requires the datasets to be re-generated. I took the freedom to generate with `--scaling-factor 10`, which results in `300` transactions per client. This should be long enough to identify regressions, with hopefully 10x shorter benchmark time in CI.

Another benefit of this separation is that it naturally led to reducing the assumptions of the `demo` mode by not seeding the hydra node cardano keys, but re-using `seed-devnet.sh`, and consequently to looser coupling between the workload and container setup in our network test workflow.

I'm not 100% happy with how the bench now requires the `--output-directory` to be empty, and in turn the whole state will be captured as an artifact of our CI. Instead, making the state directory always a /tmp path and retained in case of errors (or configurable with `--state-directory`) would be better. But that can go into another PR .. another time.
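The alternative mentioned in the last paragraph (a temporary state directory that is kept only when a run fails, or an explicitly configured one) could look roughly like the sketch below. This is not the hydra-cluster implementation, which is written in Haskell; Python is used here only for brevity, and `state_directory` and the `bench-` prefix are hypothetical names.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def state_directory(path=None):
    """Yield a directory for benchmark state (hypothetical helper).

    If `path` is given, it is created if needed and never deleted.
    Otherwise a fresh temporary directory is used: it is removed when the
    block succeeds and retained when the block raises, so the state can
    be inspected after a failure.
    """
    if path is not None:
        directory = Path(path)
        directory.mkdir(parents=True, exist_ok=True)
        yield directory
        return

    directory = Path(tempfile.mkdtemp(prefix="bench-"))
    try:
        yield directory
    except Exception:
        print(f"run failed, state retained in {directory}")
        raise
    else:
        shutil.rmtree(directory)


# Usage: results written here disappear on success, survive on failure.
with state_directory() as state:
    (state / "results.csv").write_text("placeholder\n")
```

With this shape, results that should always be kept (for plotting, say) would still need to be copied to a separate output location before the block exits.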