Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use synchronous decompression #301

Closed
wants to merge 1 commit into from

Conversation

OndroNR
Copy link

@OndroNR OndroNR commented Jan 8, 2016

This fixes #298 which was caused by asynchronous nested message set decompression which resulted in out of order message consuming. Kafka returns messages in MessageSet which contains nested compressed MessageSets of 2000 messages (in my testing). Unfortunately decompression was asynchronous without any synchronisation, so messages were processed out of order in batches of max 2000 (depends on maxTickMessages). My fix applies synchronous decompression.

@estliberitas
Copy link
Contributor

@OndroNR To say, replacing async with async-like sync is no good in general. This is more about how kafka-node does decompression, is it queued or not. And I guess, if you have such a problem, then it does something wrong - w/o queueing.

@estliberitas
Copy link
Contributor

More interested how this is implemented in Java version.

@OndroNR
Copy link
Author

OndroNR commented Jan 16, 2016

@estliberitas would queued async decompression, which adds complexity, come with benefit here? Async decompression in kafka-node is missing queueing, resulting in loosing message order guarantee. Decompression is not I/O dependent, so I don't see any benefit in async.

@estliberitas
Copy link
Contributor

@OndroNR The benefit is not-blocking JS thread. I was trying to tell, that this is not the best solution, just trashing this library with simple workaround which should be solved in a more clean way. And sure with more work, time, etc.

Node.js has zlib.gunzip and zlib.gunzipSync not w/o reason. I'd like to know what's the time of event loop tick when 2-5k messages are processed synchronously.

Let's say someone has application with many-many Websocket connections requiring keep-alive packets to be sent "very often". And then we have a bunch of blocked-enough event loop ticks. And then people will start to profile. And they will see how much time is spent on zlib.gunzipSync. This scenario is not that real, yep, but it's very close to real situation.

Update: another one thing to consider: nodejs/node#1479 (comment). When blocked too long, GC can be called to rarely to collect everything.

@OndroNR
Copy link
Author

OndroNR commented Feb 22, 2016

@estliberitas that make sense. My use case does not suffer from this problem. I will keep this PR here for reference.

@aikar
Copy link
Contributor

aikar commented Aug 17, 2018

This PR can be closed. I've identified the bug that causes messages to be out of order, and it's simply a problem with the decoders logic. Asynchronous can work and still come in order.

I'm going to try to PR a fix next week.

@OndroNR
Copy link
Author

OndroNR commented Aug 20, 2018

This bug is still present in library after such time? We are not consuming Kafka in node.js anymore. I just kept this PR open for reference. I will close it when you create PR.

@aikar
Copy link
Contributor

aikar commented Aug 20, 2018

@OndroNR might as well go ahead and close it since this PR isn't even mergeable anymore.

@OndroNR
Copy link
Author

OndroNR commented Aug 20, 2018

@aikar Alright, closing

@OndroNR OndroNR closed this Aug 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong message order
3 participants