
Event callbacks #137

Open
edenhill opened this issue Aug 17, 2014 · 18 comments
@edenhill
Contributor

Events:

  • Topic state & leader change
  • Broker connection up/down (with total counts of brokers up/down)
  • ..?

Callback set with rd_kafka_event_cb_set(rk, ..).
Served by rd_kafka_poll()
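
For illustration, a rough sketch of what the proposed callback might look like from the application side. rd_kafka_event_cb_set() and the event struct below are hypothetical, taken from the proposal above; they are not part of the current librdkafka API:

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* HYPOTHETICAL: sketch of the proposed event callback.
 * Neither this struct nor rd_kafka_event_cb_set() exist in librdkafka
 * today; they only illustrate the proposal in this issue. */
struct proposed_event {
        int type;               /* leader change, broker up, broker down */
        const char *broker;     /* affected broker, if any */
        int brokers_up;         /* total brokers currently up */
        int brokers_down;       /* total brokers currently down */
};

static void my_event_cb (rd_kafka_t *rk, const struct proposed_event *ev,
                         void *opaque) {
        fprintf(stderr, "event %d on %s: %d broker(s) up, %d down\n",
                ev->type, ev->broker ? ev->broker : "-",
                ev->brokers_up, ev->brokers_down);
}

/* Proposed registration: rd_kafka_event_cb_set(rk, my_event_cb);
 * the callback would then be served from rd_kafka_poll(rk, timeout_ms). */
```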

@edenhill edenhill self-assigned this Aug 17, 2014
@rthalley
Contributor

+1 on this. I want to be able to log state changes in topic+partition leadership, so I can say whether the topic+partition is "up" or "down".

@edenhill edenhill changed the title from "Event callbacks queue" to "Event callbacks" Apr 19, 2015
@DEvil0000
Contributor

+1
I need to stop/pause consumers depending on producer state (consumer->producer).

@edenhill
Contributor Author

@DEvil0000 Pause/Resume is now available on master branch.
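
A minimal sketch of those calls (the topic name, partition and surrounding flow are illustrative; error handling is trimmed):

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* Pause and later resume consumption of one topic+partition.
 * Assumes `rk` is an existing consumer handle and "mytopic"/0 is a
 * partition it is currently consuming. */
static void pause_then_resume (rd_kafka_t *rk) {
        rd_kafka_topic_partition_list_t *parts =
                rd_kafka_topic_partition_list_new(1);
        rd_kafka_topic_partition_list_add(parts, "mytopic", 0);

        /* Stop fetching from the listed partitions. */
        rd_kafka_resp_err_t err = rd_kafka_pause_partitions(rk, parts);
        if (err)
                fprintf(stderr, "pause failed: %s\n", rd_kafka_err2str(err));

        /* ... wait for the producer side to catch up ... */

        /* Resume fetching. */
        err = rd_kafka_resume_partitions(rk, parts);
        if (err)
                fprintf(stderr, "resume failed: %s\n", rd_kafka_err2str(err));

        rd_kafka_topic_partition_list_destroy(parts);
}
```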

@DEvil0000
Contributor

Thanks, is the state now also observable?

@edenhill
Contributor Author

No sir

@rolandyoung

+1 Some error states (such as misconfiguration of the broker URL) would cause our application log to be spammed rather enthusiastically with errors such as

FAIL: 127.0.0.1:10002/0: Connect to ipv4#127.0.0.1:10002 failed: Connection refused

To avoid this we use a heuristic approach: if we see an error, we set a state variable to 'in error' and increment an error count; if this reaches 10, we stop logging until we come out of the error state, which is the tricky bit. At the moment the best we have come up with is to watch the output queue (our application is a producer): if it has got shorter, or if it's empty and we have seen no new errors for a few seconds, we log that the error state is over and reset the error count. This mostly works, but is very occasionally prone to spurious flipping between states, especially if we are not producing any messages at the time.
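
In code, that heuristic looks roughly like this (ERROR_LOG_LIMIT, the state variables and the recovery check are our own invention, not librdkafka APIs; only error_cb and rd_kafka_outq_len() come from the library):

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

#define ERROR_LOG_LIMIT 10    /* stop logging after this many errors */

static int error_count = 0;   /* errors seen in the current error state */
static int in_error    = 0;   /* 1 while we believe we are in error */
static int last_outq   = 0;   /* output queue length at the last check */

/* Registered with rd_kafka_conf_set_error_cb() and served from
 * rd_kafka_poll(): log the first ERROR_LOG_LIMIT errors, then go quiet. */
static void error_cb (rd_kafka_t *rk, int err,
                      const char *reason, void *opaque) {
        in_error = 1;
        if (++error_count <= ERROR_LOG_LIMIT)
                fprintf(stderr, "FAIL: %s: %s\n",
                        rd_kafka_err2str((rd_kafka_resp_err_t)err), reason);
}

/* Called every few seconds from the application's main loop: if the
 * output queue has shrunk (messages are getting through), assume the
 * error state is over and re-arm logging. (The "queue empty and no new
 * errors for a few seconds" case is omitted here for brevity.) */
static void check_recovery (rd_kafka_t *rk) {
        int outq = rd_kafka_outq_len(rk);
        if (in_error && outq < last_outq) {
                fprintf(stderr, "error state cleared\n");
                in_error    = 0;
                error_count = 0;
        }
        last_outq = outq;
}
```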

We would find it very useful indeed to have a way of knowing whether the application is connected to at least one broker.

@edenhill
Contributor Author

edenhill commented Mar 9, 2017

Repeated errors for the same broker (and address!) should be suppressed, but if the broker resolves to multiple addresses or multiple address families (which localhost does on some OSes, providing both IPv4 and IPv6 addresses while Kafka typically only listens on one family), there might be an endless ping-pong of logs, since librdkafka round-robins all addresses for a broker.

We would find it very useful indeed to have a way of knowing whether the application is connected to at least one broker.

I'm guessing you want this to be push notifications (e.g., a callback or event served through ..poll()) rather than an API you need to call to check it, right?

If so, for an event-like thing, what would the triggering factors be and what info should the event contain?
Trigger on: broker up & down
Info: The broker, number of brokers in up/down state?

@rolandyoung

We could use either a push notification or a query API. If it was a push event, then the trigger and info you suggest would suit us very well.
Thanks for the extraordinarily quick response!

@asharma339
Contributor

asharma339 commented Mar 9, 2017

If all one needs is to know whether the broker is up or down, one may call rd_kafka_metadata() to detect that the broker is unreachable; it returns RD_KAFKA_RESP_ERR__TRANSPORT if the broker is down. Is there anything wrong with that approach?
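
Something along these lines, for reference (the 1s timeout and the return convention are illustrative):

```c
#include <stddef.h>
#include <librdkafka/rdkafka.h>

/* Probe broker reachability with a metadata request.
 * Returns 1 if at least one broker answered, 0 otherwise. */
static int broker_reachable (rd_kafka_t *rk) {
        const struct rd_kafka_metadata *md;
        rd_kafka_resp_err_t err =
                rd_kafka_metadata(rk, 0 /*known topics only*/, NULL,
                                  &md, 1000 /*timeout ms*/);

        if (err == RD_KAFKA_RESP_ERR_NO_ERROR) {
                rd_kafka_metadata_destroy(md);
                return 1;
        }
        /* RD_KAFKA_RESP_ERR__TRANSPORT (or __TIMED_OUT) means no broker
         * could be reached within the timeout. */
        return 0;
}
```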

@edenhill
Contributor Author

edenhill commented Mar 9, 2017

@asharma339 This will do the trick, but the question is when to do it and how often (there is a cost involved on the broker).

@asharma339
Contributor

asharma339 commented Mar 9, 2017

For a consumer, I was thinking about doing it every time the fetched message count is 0. For a producer, it could be based on some more sophisticated criteria, like rd_kafka_outq_len() constantly increasing.

@edenhill
Contributor Author

edenhill commented Mar 9, 2017

Shouldn't ALL_BROKERS_DOWN be sufficient for that?

@asharma339
Contributor

asharma339 commented Mar 9, 2017

I never see ALL_BROKERS_DOWN or any other error while producing or consuming if I deliberately add junk URLs in the call to rd_kafka_brokers_add(). All I see is that wherever I call rd_kafka_metadata(), I get RD_KAFKA_RESP_ERR__TRANSPORT. Otherwise, all produce and consume calls succeed: the consumer poll simply retrieves 0 messages, and my producer's message delivery callback never gets called with either success or failure. This is all with the C API.

@edenhill
Contributor Author

edenhill commented Mar 9, 2017

If you register an error_cb you should get an ALL_BROKERS_DOWN error.
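
For example, a minimal sketch (the conf must be set up before rd_kafka_new(), and the callback is served from rd_kafka_poll()):

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* error_cb receives ALL_BROKERS_DOWN (among other errors); it is
 * served from rd_kafka_poll(), so poll regularly. */
static void error_cb (rd_kafka_t *rk, int err,
                      const char *reason, void *opaque) {
        if (err == RD_KAFKA_RESP_ERR__ALL_BROKERS_DOWN)
                fprintf(stderr, "all brokers are down: %s\n", reason);
        else
                fprintf(stderr, "error: %s: %s\n",
                        rd_kafka_err2str((rd_kafka_resp_err_t)err), reason);
}

/* The callback must be registered on the conf object before
 * rd_kafka_new() creates the handle. */
static rd_kafka_t *create_producer (void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        rd_kafka_conf_set_error_cb(conf, error_cb);

        return rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
}
```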

@asharma339
Contributor

asharma339 commented Mar 9, 2017

Certainly. Since the application would come to know about failures in an async manner, it would need some state to communicate them back to the main thread. The issue is specifically with a producer that may want to retry on such a failure. But the failed message would probably still be in the output queue and could be reprocessed if we want to retry?

@rolandyoung

Yes, the error_cb works well to let us know that things have gone bad and the output queue handles error recovery. Our problem is reliably detecting when the error has been recovered, in order to reset our logging state. Maybe calling rd_kafka_metadata() every few seconds until it succeeds would work for that - as long as it's failing it doesn't cost the brokers anything.

@edenhill
Contributor Author

@rolandyoung That sounds like a good idea

@DEvil0000
Contributor

DEvil0000 commented Mar 10, 2017 via email
