Fix bench-e2e single mode and keep results #1693

ch1bo · 2024-10-08T17:49:24Z

This fixes two issues with the bench-e2e binary / benchmark:

- Running in single mode was not working because of a FeeTooSmallUTxO error.
- The results.csv was written into a temporary directory and removed afterwards, which made plotting impossible.

I was in the mood for some refactoring, so this also contains various other changes I came across while working on the code and tidying up a bit.

The refactoring separated hydra node and payment keys further, which requires the datasets to be re-generated. I took the liberty of generating them with --scaling-factor 10, which results in 300 transactions per client. This should be long enough to identify regressions, with hopefully 10x shorter benchmark time in CI.

Another benefit of this separation is that it naturally led to reducing the assumptions of the demo mode: instead of seeding the hydra node cardano keys, it re-uses seed-devnet.sh. Consequently, there is looser coupling between the workload and the container setup in our network test workflow.

I'm not 100% happy with how the bench now requires the --output-directory to be empty, which in turn means the whole state is captured as an artifact of our CI. Making the state directory always a /tmp path that is retained in case of errors (or configurable with --state-directory) would be better. But that can go into another PR, another time.
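The alternative described above (a temporary state directory that is only kept when something goes wrong) could look roughly like the following. This is a minimal Python sketch of the pattern, not the hydra-cluster implementation; `state_directory` and its `requested` parameter are hypothetical names standing in for the proposed --state-directory option.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def state_directory(requested=None):
    """Yield a directory for benchmark state.

    Hypothetical helper: an explicitly requested directory is always kept,
    while a temporary one is removed on success and retained on failure.
    """
    if requested is not None:
        path = Path(requested)
        path.mkdir(parents=True, exist_ok=True)
        yield path
        return

    path = Path(tempfile.mkdtemp(prefix="bench-"))
    try:
        yield path
    except BaseException:
        # Keep the state around so the failure can be inspected.
        print(f"Keeping state directory for inspection: {path}")
        raise
    # Only reached when the body finished without an exception.
    shutil.rmtree(path)
```

With this shape, a successful run leaves nothing behind unless a directory was requested explicitly, so CI would only need to upload an artifact for failed runs or for a deliberately chosen output location.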

- CHANGELOG updated
- Documentation updated (README)
- Haddocks updated
- No new TODOs introduced or explained hereafter
  - Two XXX notes of what to improve further

github-actions · 2024-10-08T18:15:06Z

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using arbitrary values and results are not fully deterministic and comparable to previous runs.

Metadata
Generated at	2024-10-10 10:31:51.922255033 UTC
Max. memory units	14000000
Max. CPU units	10000000000
Max. tx size (kB)	16384
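
The percentage columns in the cost tables of this report (% max Mem, % max CPU) can be read against the limits listed here. The following Python sketch assumes that a percentage is simply the share of the corresponding limit, and converts the 1-party row of the `Init` table (5.79 % memory, 2.29 % CPU) back into approximate absolute units. The function name `absolute_units` is made up for illustration, and because the report rounds percentages to two decimals, the results are approximations.

```python
MAX_MEM_UNITS = 14_000_000       # "Max. memory units" from the Metadata table
MAX_CPU_UNITS = 10_000_000_000   # "Max. CPU units" from the Metadata table


def absolute_units(percent_of_max, limit):
    """Convert a '% max' table value into approximate absolute units."""
    return round(limit * percent_of_max / 100)


# 1-party row of the Init table: 5.79 % max Mem, 2.29 % max CPU.
init_mem = absolute_units(5.79, MAX_MEM_UNITS)
init_cpu = absolute_units(2.29, MAX_CPU_UNITS)
print(init_mem, init_cpu)  # prints "810600 229000000"
```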

Script summary

Name	Hash	Size (Bytes)
νInitial	b512161ccb0652d7e9a0b540e4a3c808f73d6558a4bcabf374d85880	3969
νCommit	ea444d37d226e71eef73ac78d149750da977feb588900135bf9e8221	692
νHead	2253ddd95837c7aacc8635a971caaea743434152dd8dd2849bdf4162	10797
μHead	4d648ca239040b0e87901835aa11423e7aa3bd947ce6befe7db1bae8*	4508
νDeposit	1a011f23b139a6426767026bde10319546485d553219a5848cdac4e5	2993

* The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

`Init` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	5096	5.79	2.29	0.44
2	5297	7.09	2.80	0.46
3	5499	8.73	3.46	0.49
5	5901	11.26	4.45	0.53
10	6907	18.11	7.16	0.65
57	16355	82.91	32.79	1.78

`Commit` transaction costs

This uses ada-only outputs for better comparability.

UTxO	Tx size	% max Mem	% max CPU	Min fee ₳
1	567	10.84	4.26	0.29
2	758	14.31	5.80	0.34
3	944	17.92	7.39	0.39
5	1323	25.56	10.73	0.49
10	2257	47.11	19.97	0.77
19	3947	94.71	39.81	1.38

`CollectCom` transaction costs

Parties	UTxO (bytes)	Tx size	% max Mem	% max CPU	Min fee ₳
1	57	560	20.58	7.85	0.40
2	113	671	28.02	10.67	0.48
3	171	782	37.34	14.18	0.59
4	227	893	47.04	17.86	0.70
5	282	1009	56.92	21.60	0.81
6	340	1116	67.08	25.44	0.92
7	396	1227	67.51	25.68	0.93
8	448	1338	80.43	30.57	1.08
9	504	1449	80.47	30.65	1.09

Cost of Decrement Transaction

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	650	18.40	8.07	0.39
2	731	18.50	8.83	0.40
3	858	19.95	10.17	0.42
5	1179	23.56	13.12	0.49
10	1989	32.96	20.62	0.65
47	7736	96.50	73.89	1.80

`Close` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	672	20.87	9.34	0.42
2	846	22.67	11.10	0.45
3	915	23.70	12.18	0.47
5	1199	26.64	15.09	0.53
10	1887	34.14	22.55	0.67
50	8222	99.29	86.57	1.94

`Contest` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	691	26.76	11.48	0.48
2	765	28.12	12.68	0.50
3	968	30.31	14.59	0.54
5	1204	33.94	17.72	0.61
10	2027	43.82	26.44	0.78
39	6521	99.67	75.65	1.79

`Abort` transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	4971	15.30	6.55	0.54
2	5053	21.29	9.07	0.61
3	5180	25.54	10.89	0.66
4	5288	32.14	13.76	0.74
5	5680	44.07	19.41	0.90
6	5625	49.48	21.48	0.95
7	5852	56.62	24.78	1.04
8	6131	64.89	28.64	1.15
9	6024	65.89	28.54	1.15
10	6293	76.21	33.31	1.28
11	6475	87.40	38.25	1.42
12	6408	91.57	39.93	1.46
13	6565	96.51	41.91	1.52

`FanOut` transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.

Parties	UTxO	UTxO (bytes)	Tx size	% max Mem	% max CPU	Min fee ₳
10	0	0	5090	10.38	4.35	0.49
10	1	57	5123	10.96	4.81	0.50
10	5	285	5260	15.83	7.79	0.57
10	10	570	5430	21.87	11.49	0.65
10	20	1138	5768	33.16	18.55	0.81
10	30	1710	6111	45.44	26.05	0.97
10	40	2279	6451	56.74	33.12	1.13
10	50	2848	6789	68.63	40.45	1.30
10	76	4321	7664	99.27	59.39	1.72

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2024-10-10 10:34:24.332196246 UTC

Baseline Scenario

Number of nodes	1
Number of txs	300
Avg. Confirmation Time (ms)	5.551930950
P99	12.216956419999997ms
P95	7.627868650000001ms
P50	5.3302375ms
Number of Invalid txs	0

Three local nodes

Number of nodes	3
Number of txs	900
Avg. Confirmation Time (ms)	24.358177392
P99	48.859335549999486ms
P95	32.6859822ms
P50	22.443132ms
Number of Invalid txs	0
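
For readers who want to reproduce figures like the average and the P50/P95/P99 values from their own list of confirmation times, here is a small Python sketch. It is not the code used by bench-e2e: the linear-interpolation percentile and the names `percentile` and `summarize` are assumptions chosen for illustration, and the actual benchmark may use a different quantile estimator. Percentiles are only reported when there is at least one sample, in the spirit of the "Only show quantiles if they can be computed" commit.

```python
from statistics import mean


def percentile(samples, p):
    """Percentile (p in 0..100) using linear interpolation; None if empty."""
    if not samples:
        return None
    xs = sorted(samples)
    rank = (len(xs) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)


def summarize(times_ms):
    """Summarize confirmation times given in milliseconds."""
    summary = {"count": len(times_ms)}
    if times_ms:
        summary["avg"] = mean(times_ms)
        for p in (50, 95, 99):
            summary[f"p{p}"] = percentile(times_ms, p)
    return summary
```

Calling `summarize([])` yields only the count, so a run without confirmed transactions produces a report without misleading quantile rows.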

github-actions · 2024-10-08T18:20:14Z

Test Results

544 tests ±0 538 ✅ ±0 26m 34s ⏱️ +4s
162 suites ±0 6 💤 ±0
7 files ±0 0 ❌ ±0

Results for commit ec21ac0. ± Comparison against base commit dff6655.

♻️ This comment has been updated with latest results.

hydra-cluster/bench/Bench/EndToEnd.hs

hydra-cluster/hydra-cluster.cabal

noonio

Minor comments; happy to merge if all the tests pass!

noonio · 2024-10-09T09:53:11Z

In fact the network tests are failing - https://github.com/cardano-scaling/hydra/actions/runs/11241196905/job/31255255123?pr=1693

Same transaction style (single re-spending txs), but a deliberately shorter sequence of transactions (3000 -> 300) to get shorter benchmark run-times, while the sequence should still be long enough to identify regressions. Generated with the invocations cabal run bench-e2e -- single --cluster-size 1 --scaling-factor 10 and cabal run bench-e2e -- single --cluster-size 3 --scaling-factor 10, plus some manual amending of the JSON to contain a "title".

This further decouples the bench-e2e binary, which just produces load and provides statistics, from how the hydra-nodes are run. Now the only assumption is that the 'hydra-cluster/config/credentials/faucet.sk' owns funds on the given network.

ch1bo force-pushed the fix-bench-standalone branch 3 times, most recently from 506062b to 9eb745d October 8, 2024 18:09

ch1bo self-assigned this Oct 8, 2024

ch1bo requested a review from a team October 8, 2024 18:11

ch1bo added the red bin label Oct 8, 2024

noonio reviewed Oct 9, 2024

hydra-cluster/bench/Bench/EndToEnd.hs (resolved)

noonio reviewed Oct 9, 2024

hydra-cluster/hydra-cluster.cabal (resolved)

noonio approved these changes Oct 9, 2024

noonio force-pushed the fix-bench-standalone branch 2 times, most recently from 0b81249 to 1dbd899 October 9, 2024 12:29

ch1bo force-pushed the fix-bench-standalone branch from 1dbd899 to 5b34aa0 October 9, 2024 12:32

ch1bo and others added 15 commits October 10, 2024 11:33

Fix hydra-cluster bench single by setting a hard-coded fee

dbc0d84

This is not ideal, but a lot simpler than doing proper fee calculation. It's unclear why fee calculation was removed before; it is needed when running benchmark scenarios.

Remove redundant bench-e2e mode of single + workdir set

ea1d38f

This is redundant and can be achieved by using the 'datasets' subcommand.

Write generated dataset to outputDirectory if given

d905ddb

Before it was written to a random temporary directory, which makes it annoying to generate datasets with this mode.

Write results.csv in --output-directory

8f878a0

The hydra-cluster benchmarks now only use a single directory to store the whole state, which is temporary unless a specific output-directory is requested.

Small module re-org in HydraNode.hs

2b1dfe5

Drop need of party in benchmark scenario

215bc71

Use a bit more non empty lists in hydra-cluster

46ee433

Switch to only use self-transfers in benchmarks

a6f74a6

This reduces some code duplication without much loss of expressiveness (which key we use does not matter).

Separate client keys and hydra node keys

f4733d4

Fail if output directory is not empty

2c5d063

Start with an empty output directory

9317e88

Only show quantiles if they can be computed

b0bade5

Use and seed hydraNodeKeys in demo mode

5a4718c

As before, the bench-e2e does not assume the hydra node keys to be seeded. This ties the bench-e2e binary (which hard-codes Alice, Bob and Carol) to the configurable list of --hydra-client to connect to.

ch1bo force-pushed the fix-bench-standalone branch from 4b50543 to a48d056 October 10, 2024 09:33

ch1bo added this pull request to the merge queue Oct 10, 2024

Add a comment to the benchmark scenario

ec21ac0

ch1bo removed this pull request from the merge queue due to a manual request Oct 10, 2024

ch1bo enabled auto-merge October 10, 2024 10:25

ch1bo added this pull request to the merge queue Oct 10, 2024

Merged via the queue into master with commit 321167e Oct 10, 2024
25 of 28 checks passed

ch1bo deleted the fix-bench-standalone branch October 10, 2024 10:57
