delivery callbacks for batch production #88

Closed
chessai opened this issue Mar 15, 2019 · 6 comments

@chessai (Contributor) commented Mar 15, 2019

When producing a message batch, there's no way to have a delivery callback that operates on the entire batch. Currently, callbacks only operate on a single DeliveryReport, which assumes single-message production. I'd still like to know about the individual messages, but I'd also like the callback to tell me the size of the batch, or maybe even more sophisticated statistics. I can compute these before sending, but most of the time I don't care about them and would rather only see them upon some kind of delivery failure.

@AlexeyRaga (Member)

I don't know this for sure, but if I remember correctly, there is no such thing as a batch callback in librdkafka.

We can set (and we do) this one:

```c
void rd_kafka_conf_set_dr_cb(rd_kafka_conf_t *conf,
                             void (*dr_cb) (rd_kafka_t *rk,
                                            void *payload, size_t len,
                                            rd_kafka_resp_err_t err,
                                            void *opaque, void *msg_opaque));
```

which gives us info about individual messages, but it doesn't carry any information about the original batch.

This is probably due to the async communication between librdkafka and the broker: librdkafka has its own internal buffer, a retry policy, etc.
Even if we use rd_kafka_produce_batch, I believe it can still treat messages independently internally (because they can go to different partitions), so there is no such thing as "send a batch, get one callback" at the librdkafka level.
At least to my knowledge.

Also, we had a brief discussion about rd_kafka_produce_batch being limited to a certain extent, and then I kind of gave up on it:
[screenshot omitted]

@chessai (Contributor, Author) commented Mar 16, 2019

Ah, I see. That's a little inconvenient. I'm curious: why is batch production pointless without "extreme throughput"? What does @edenhill classify as "extreme"? That would be useful to know, because at work we may end up switching to single-message production depending on our needs. Batch production was just slightly easier in our case.

@AlexeyRaga (Member)

OK, here is my understanding.

I think it just means that librdkafka is so fast, even when sending single messages, that it is unlikely to be a bottleneck in real life. Single-message production is only an API front end; internally, librdkafka still batches requests to the brokers. And sending a message is never a blocking operation: librdkafka just puts the message into its internal queue and returns control immediately.
That's why we need to watch the delivery reports: calling send in the client doesn't mean that the message has actually been sent over the wire and accepted by the Kafka cluster.

As I said, librdkafka optimises for throughput and will internally batch messages before sending them to the brokers, so it makes little difference whether we enqueue messages individually.
One difference with batch producing is that when you produce a batch and specify the partition ID for that batch, it runs "faster" because it doesn't need to run the partitioning algorithm on each message (hashing message keys and potentially routing to different leaders); the whole batch goes at once.
However, when you don't specify the partition, or have it assigned randomly, the partitioner still runs against every message in the batch.

There could be other micro-optimisations for batches. But the point is that when our applications do other work as well, they are rarely limited by the difference between single- and batch-producing.
I don't know the general numbers, but I measured it once for one of my apps, and it made no difference at all. To be honest, I didn't expect it to, since that app came nowhere close to saturating the upstream connection to Kafka (it didn't produce that many messages).
So I decided that if I ever need to push many millions of messages as quickly as possible, I'll revisit batching and measure again to see what it really buys. But I haven't been in that situation yet :)
And working with individual messages is more convenient, so I haven't touched batch producing since...

@AlexeyRaga (Member) commented Mar 18, 2019

@chessai if you have high throughput, it would be interesting to see the difference between batching and single messages. If you end up measuring it, please let me know your results! :)

@edenhill

@AlexeyRaga is completely right: the batch API is just a slightly more optimized enqueuing interface for some specific use cases (large chunks of pre-partitioned messages, for instance). For the common case (using the partitioner) it is no faster than the single-message API, and it lacks support for newer field types such as headers and timestamps (see rd_kafka_producev()).

@chessai (Contributor, Author) commented Mar 18, 2019

@AlexeyRaga @edenhill awesome, thanks for the thorough explanations.

Labels: none yet · Projects: none yet · No branches or pull requests · 3 participants