Use same amount of messages for all consumer benchmarks #1382
Conversation
```diff
@@ -21,7 +21,7 @@ trait ComparisonBenchmark extends ZioBenchmark[Env] {
   protected final val nrPartitions: Int = 6
   protected final val topicPartitions: List[TopicPartition] =
     (0 until nrPartitions).map(TopicPartition(topic1, _)).toList
-  protected final val numberOfMessages: Int = 1000000
+  protected final val numberOfMessages: Int = 50000
```
Is it still enough to get a stable benchmark result..?
Most probably. We deemed it sufficient for the zio-kafka based benchmarks. Historically there is variation in the results, but I suspect it's fine.
On the other hand, it might be better to raise both to 100_000. I suspect the benchmark duration will not be hugely affected.
100k messages does affect total runtime, while the ratio between the different benches does not change. So back to 50k.
Force-pushed 2cd592d to 1859be6
This is quite amazing. Doubling the number of messages also almost doubles the throughput! (Both for zio-kafka and the direct consumers.) That means that the benchmarks are not really representative. We probably need to increase the number of messages further, and maybe also make them larger.
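A plausible explanation for this effect (my assumption, not stated in the thread) is that a fixed startup/warm-up cost dominates the measured time, so throughput grows almost linearly with the record count until the per-record work takes over. A minimal Scala sketch of that toy model, with made-up overhead and per-record times:

```scala
// Toy model (assumed, not from the PR): total time = fixed startup overhead
// plus per-record processing time. When the fixed overhead dominates,
// measured throughput grows almost linearly with the record count.
def measuredThroughput(records: Long, fixedOverheadSec: Double, perRecordSec: Double): Double =
  records / (fixedOverheadSec + records * perRecordSec)

val t50k  = measuredThroughput(50000L, 10.0, 0.00002)  // 50k records
val t100k = measuredThroughput(100000L, 10.0, 0.00002) // 100k records
// t100k / t50k = 11/6 ≈ 1.83: close to 2, so at these (invented) parameters
// the fixed overhead still dominates even at 100k records.
```

Only once the record count is large enough that per-record time dwarfs the overhead does measured throughput flatten out to the steady-state rate, which is what a representative benchmark should report.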
We could also look at latency metrics for various RPS loads.
Also: extract common test parameters
Force-pushed 1859be6 to 00f8855
Benches take very long with large messages.
Going from 50k 512-byte records to 100k 1024-byte records makes the benches 3 times slower (not 4 times, as one might expect). This is true for all consumer benchmarks. The benchmarks in total go from ~20 min to ~17 min. With 50k 512-byte records the benchmark workflow takes about as long as before (which had 50k 10-byte records). I would like to merge this change and then rewrite the bench history a bit so that it all stays (roughly) comparable.
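For reference, the naive "4 times" expectation comes from the byte arithmetic: doubling both the record count and the record size quadruples the total bytes moved. A tiny sketch (the ~3x figure is the observed slowdown reported above; the rest is just illustration):

```scala
// Total payload bytes for the two configurations discussed in the comment.
val bytesSmall = 50000L * 512L    // 50k records of 512 bytes
val bytesLarge = 100000L * 1024L  // 100k records of 1024 bytes

// Naive expectation: runtime scales with bytes, i.e. a 4x slowdown.
val expectedFactor = bytesLarge.toDouble / bytesSmall
// Observed in the PR: only ~3x, so runtime is not purely byte-bound.
val observedFactor = 3.0
```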
zio-kafka-bench/src/main/scala/zio/kafka/bench/ZioBenchmark.scala
Also: use `record` instead of `message` for consistency with all other zio-kafka and Kafka documentation.
In this change, common benchmark code is moved into `ZioBenchmark`, `ConsumerZioBenchmark` and `ProducerZioBenchmark`, so that it becomes easier to make the different benchmarks comparable.

After running the consumer benchmarks with different numbers of records and different record sizes per run, this PR settled on 50k records of ~512 bytes per run for all consumer benchmarks. With these amounts, the zio-kafka based benchmarks and the 'comparison' benchmarks have roughly the same scaling elasticity (where 'scaling elasticity' is defined as the throughput growth factor divided by the record-count growth factor).
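The 'scaling elasticity' definition above can be written down directly. The throughput and record-count numbers below are hypothetical, purely to illustrate the formula (the PR does not publish exact figures):

```scala
// Scaling elasticity as defined in this PR: the factor by which throughput
// grew, divided by the factor by which the record count grew. A value near
// 1.0 means throughput still scales with input size (overhead-dominated);
// a value near 0 means throughput is already at its steady-state plateau.
def scalingElasticity(
    throughputBefore: Double,
    throughputAfter: Double,
    recordsBefore: Double,
    recordsAfter: Double
): Double = {
  val throughputGrowth = throughputAfter / throughputBefore
  val recordsGrowth    = recordsAfter / recordsBefore
  throughputGrowth / recordsGrowth
}

// Hypothetical example: doubling records from 50k to 100k while throughput
// grows from 10k to 19k records/s gives an elasticity of 1.9 / 2.0 = 0.95.
val elasticity = scalingElasticity(10000.0, 19000.0, 50000.0, 100000.0)
```

Comparing this number between the zio-kafka benchmarks and the direct-consumer 'comparison' benchmarks is what makes their results comparable even while neither has reached steady state.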
After this PR is merged, the benchmark history will be rewritten with linear scaling so that we can compare historic runs against new runs.