
Use same amount of messages for all consumer benchmarks #1382

Merged: 7 commits merged into master from bench-equalizer on Nov 23, 2024
Conversation

@erikvanoosten (Collaborator) commented Nov 13, 2024

In this change, common benchmark code is moved into ZioBenchmark, ConsumerZioBenchmark and ProducerZioBenchmark so that it becomes easier to make the different benchmarks comparable.
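
For illustration, the new layering could look roughly like this (a hypothetical sketch: the trait names come from this PR, but the members shown are assumptions):

```scala
// Hypothetical sketch; only the trait names come from this PR.
// Shared setup lives in ZioBenchmark; the consumer- and producer-specific
// traits pin the workload so every benchmark processes the same records.
trait ZioBenchmark[Env] {
  // shared environment/runtime setup would live here
}

trait ConsumerZioBenchmark[Env] extends ZioBenchmark[Env] {
  protected val numberOfMessages: Int = 50000 // same count for every consumer bench
  protected val messageSize: Int      = 512   // ~512 bytes per record
}

trait ProducerZioBenchmark[Env] extends ZioBenchmark[Env] {
  protected val numberOfMessages: Int = 50000
}
```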

After running the consumer benchmarks with different numbers of records and different record sizes per run, this PR settled on 50k records of ~512 bytes per run for all consumer benchmarks. With these amounts, the zio-kafka based benchmarks and the 'comparison' benchmarks have roughly the same scaling elasticity (where 'scaling elasticity' is defined as the throughput growth factor divided by the record-count growth factor).
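
As an illustration of that definition (my sketch, not code from this PR):

```scala
// 'Scaling elasticity' as defined above: the throughput growth factor divided
// by the record-count growth factor. A value near 1.0 means throughput scales
// linearly with the number of records per run.
def scalingElasticity(
    baseRecords: Long,
    baseThroughput: Double,
    newRecords: Long,
    newThroughput: Double
): Double =
  (newThroughput / baseThroughput) / (newRecords.toDouble / baseRecords)

// Example: doubling the records (factor 2.0) while throughput grows by a
// factor of 1.8 gives an elasticity of 0.9.
```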

After this PR is merged, the benchmark history will be rewritten (scaling historic results linearly) so that we can compare historic runs against new runs.

```diff
@@ -21,7 +21,7 @@ trait ComparisonBenchmark extends ZioBenchmark[Env] {
   protected final val nrPartitions: Int = 6
   protected final val topicPartitions: List[TopicPartition] =
     (0 until nrPartitions).map(TopicPartition(topic1, _)).toList
-  protected final val numberOfMessages: Int = 1000000
+  protected final val numberOfMessages: Int = 50000
```
Collaborator:
Is it still enough to get a stable benchmark result..?

@erikvanoosten (Author):

Most probably. We deemed it sufficient for the zio-kafka based benchmarks. Historically there is variation in the results, but I suspect it's fine.

@erikvanoosten (Author):

On the other hand, it might be better to raise both to 100_000. I suspect the benchmark duration will not be hugely affected.

@erikvanoosten (Author):

100k messages does affect the total runtime, while the ratio between the different benches does not change. So, back to 50k.

@erikvanoosten (Author):

This is quite amazing. Doubling the number of messages also almost doubles the throughput! (Both for zio-kafka and the direct consumers.)

That means that the benchmarks are not really representative. We probably need to increase the number of messages further, and maybe also make the messages larger.
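
One plausible explanation (my sketch, not a claim from this thread) is a fixed per-run cost that does not shrink with the workload: with a constant setup cost plus a per-record cost, measured throughput rises with the record count even though the steady-state rate is unchanged.

```scala
// Toy model: records / (fixed overhead + records * per-record cost).
// The numbers below are illustrative assumptions, not measurements.
def measuredThroughput(records: Int, fixedOverheadSec: Double, perRecordSec: Double): Double =
  records / (fixedOverheadSec + records * perRecordSec)

// With 5 s of fixed overhead and 10 µs per record:
// measuredThroughput( 50000, 5.0, 1e-5) ≈  9090 records/s
// measuredThroughput(100000, 5.0, 1e-5) ≈ 16667 records/s  -- almost double
```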

@svroonland (Collaborator):

We could also look at latency metrics for various RPS loads

Commit: "Benches take very long with large messages."
@erikvanoosten (Author) commented Nov 16, 2024

Going from 100k 1024-byte records to 50k 512-byte records makes the benches 3 times faster (not 4 times, as one might expect). This is true for all consumer benchmarks. In total, the benchmarks go from ~20 min to ~17 min.
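
For reference, the arithmetic behind the "4 times" expectation (my calculation from the numbers above): 100,000 × 1,024 B ≈ 102 MB per run versus 50,000 × 512 B ≈ 26 MB, exactly a 4× reduction in bytes; the observed 3× speedup suggests some per-run overhead that does not shrink with the workload.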

With 50k 512-byte records, the benchmark workflow takes about as long as before this PR (which used 50k 10-byte records).

I would like to merge this change and then rewrite the bench history a bit so that it all stays (roughly) comparable.

Commit: "Also: use `record` instead of `message` for consistency with all other zio-kafka and Kafka documentation."
@erikvanoosten merged commit b713170 into master on Nov 23, 2024 (12 of 13 checks passed).
@erikvanoosten deleted the bench-equalizer branch on November 23, 2024 at 09:19.