Why is kafka-go very slow compared to Sarama? #417
Comments
I wrote the following benchmark test to confirm my findings.
When the benchmark was run for 10s using
Try a fix to
I mean BatchTimeout and BatchTimeout.
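The suggestion above presumably refers to the writer's batching settings. A minimal configuration sketch, assuming the segmentio/kafka-go `Writer` API (the broker address and topic name are hypothetical placeholders, and a real broker would be needed to run this):

```go
package main

import (
	"time"

	kafka "github.com/segmentio/kafka-go"
)

func main() {
	// By default the Writer waits up to BatchTimeout (1s) to fill a batch,
	// so tiny single-message writes appear to crawl at ~1 message per second.
	w := &kafka.Writer{
		Addr:         kafka.TCP("localhost:9092"), // hypothetical broker
		Topic:        "example-topic",             // hypothetical topic
		BatchSize:    100,                         // flush once 100 messages are buffered
		BatchTimeout: 10 * time.Millisecond,       // ...or after 10ms, whichever comes first
	}
	defer w.Close()
}
```

Lowering `BatchTimeout` helps latency-sensitive workloads, while a larger `BatchSize` helps throughput; the right trade-off depends on the message rate.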
Hi @aneeskA. Keep in mind that kafka-go and sarama have very different APIs. In kafka-go, WriteMessages is synchronous: it blocks until the batch has been delivered. I think that you'll find that if you change your code to account for the synchronous logic, it will run very quickly.
@stevevls Thanks! This is very helpful. Is there an async version of WriteMessages?
When you create the Writer, you can set the Async option (writer.go, lines 115 to 119 in efce7b6); with it enabled, WriteMessages returns without waiting for delivery.
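A minimal configuration sketch of that async path, assuming the segmentio/kafka-go `Writer` API (broker address and topic name are hypothetical, and a live broker would be needed to actually run it):

```go
package main

import (
	"context"
	"log"

	kafka "github.com/segmentio/kafka-go"
)

func main() {
	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"), // hypothetical broker
		Topic: "example-topic",             // hypothetical topic
		Async: true,                        // WriteMessages no longer blocks on delivery
		// With Async set, delivery errors are no longer surfaced through the
		// WriteMessages return value, so they must be observed via Completion.
		Completion: func(messages []kafka.Message, err error) {
			if err != nil {
				log.Println("delivery failed:", err)
			}
		},
	}
	defer w.Close()

	// Returns almost immediately; the writer batches and flushes in the background.
	_ = w.WriteMessages(context.Background(), kafka.Message{Value: []byte("hello")})
}
```

The trade-off is the same one the original poster hit with goroutines: async writes buffer in memory, so unbounded producers can outrun the broker.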
Glad you enjoy using kafka-go!
Describe the bug
I am trying to send 100 GB of data into a Kafka topic by breaking it into batches of 100 lines.
Using kafka-go, I see that it writes 1 message per second, as per https://www.gitmemory.com/issue/segmentio/kafka-go/326/519375403.
To work around this issue, I created a goroutine for each write. This immediately improved the throughput.
But the application was quickly killed by the OOM killer: the data was produced faster than it could be written to Kafka, and the messages buffered by kafka-go exhausted memory.
When I did the same experiment using sarama, the 100 GB was moved into Kafka in 2h10m, with no concurrent goroutines; the writes were done one after the other.
Why is this so? Is there an example in kafka-go of moving high-volume data with high throughput?