Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event interface #273

Open
vojtechkral opened this issue Jun 26, 2020 · 5 comments
Open

Event interface #273

vojtechkral opened this issue Jun 26, 2020 · 5 comments

Comments

@vojtechkral
Copy link
Contributor

Hi,
we're using rust-rdkafka at @braiins . However, we found out that we probably wouldn't like to use the interface as is because of the requirement to spin up threads per stream and/or polling. We decided to fork it internally and switch to the event interface of librdkafka, which seems to be pretty efficient and goes quite well with the Rust's Futures model. At least a lot better compared to the callback interface, which is kind of hard to hook up with Futures/Streams without additional threads and/or long polling. I have written support for the evented interface in rust-rdkafka for our internal fork and after that we observed a multiple-times improvement in performance.

I noticed there is some work being done to get rid of the threads, however, I didn't study it in detail and if I'm looking right the evented interface is not being used...

I wanted to ask:

  • Is there a technical reason why the evented interface is not used in rust-rdkafka? Is there some kind of a downside I don't see?
  • Would you be interested in accepting code from us? I would probably need to do some cleanups, fill in documentation etc. for public usage, however, I don't want to spend time on that unless there is interest in it :-)

Thank you and thanks for making rust-rdkafka,
Vojtech

@fede1024
Copy link
Owner

Hello!

I haven't worked on the library for a while (@benesch has been doing all the work for the one 1+ years!), but I remember that moving to the event-based librdkafka interface was one of the goals I had in mind and I was actively experimenting on ~2 years ago, so I'd definitely be interested in the change you're proposing. I can't think of any downside, but again, my mental model of the library is a bit stale.

I imagine that the rust-rdkafka public interface would remain the same?

I leave it to @benesch to comment whether he'd have time to review this change or not, since it's probably going to be a significant amount of work.

@benesch
Copy link
Collaborator

benesch commented Jul 8, 2020

Apologies for the delayed response—things have been busy at work lately!

I can't promise that I'll have time to review any large changes in the short term, but hopefully in the medium-to-long term. I'm definitely interested in anything that can either simplify the implementation of rust-rdkafka or improve its performance.

That said, I'm not too familiar with librdkafka's event API. To make sure we're on the same page: you're talking about rd_kafka_conf_set_background_event_cb and friends, right? I'm not totally sure how that would fit in to the existing StreamConsumer and FutureProducer—both of those have fairly efficient implementations now, and the StreamConsumer doesn't even need a separate thread. But I can definitely believe that there are simplifications and performance improvements to be had.

So, would definitely be interested to hear more about your approach! I think no matter what we'll want to retain the polling in the BaseConsumer and BaseProducer, but there's no reason that the StreamConsumer and FutureProducer couldn't get a more efficient event-based implementation.

@vojtechkral
Copy link
Contributor Author

Apologies for the delayed response—things have been busy at work lately!

Completely understandable, I know the feeling :)

That said, I'm not too familiar with librdkafka's event API.

Well librdkafka's documentation is less clear/elloquent about the event interface, it took me some headscratching & hokuspokus to grok it...

To make sure we're on the same page: you're talking about rd_kafka_conf_set_background_event_cb and friends, right?

I'm using the rd_kafka_queue_cb_event_enable() function to register a small function as an event callback - the only thing it does is that it calls a Waker that wakes a stream waiting on the consumer/producer.

I've modified the StreamConsumer to do this. I haven't modified any producer and instead created an AsyncProducer type which does the analogous thing. Both StreamConsumer and AsyncProducer have a function called event_stream() which returns an EventStream type which implements the stream, kind of like MessageStream that you already have, except it yields events and doesn't do any polling. The event types for consumer and producer differ somewhat (consumer has fetch events, producer DR events).

Another thing about this approach is that basically I don't need the Context types, my StreamConsumer type has no generic argument. The data is instead contained in the events. Although honestly I've so far only written support for the event types that I acutally use (we don't use events like rebalance or stats so far).

I'll try to share the code somewhere for peeking. Perhaps I could somehow divide the changes into smaller chunks so that they could be easier to review...

@benesch
Copy link
Collaborator

benesch commented Jul 9, 2020

Well librdkafka's documentation is less clear/elloquent about the event interface, it took me some headscratching & hokuspokus to grok it...

No kidding. It's hardly mentioned in issue reports or the wiki either.

I think the best path forward here would be to introduce your consumer as a new consumer type alongside your new producer. (AsyncProducer isn't quite different enough from FutureProducer IMO, but we can bikeshed the names later!) There's enough code written around the current model, plus it's the model that people familiar with librdkafka (or its bindings in other languages) will be used to, that I think we'll want to preserve it as an option.

@davidblewett
Copy link
Collaborator

@vojtechkral I'd be very interested to see what you have. Can you make it available somewhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants