-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Runloop might be suspended when assigned with many partitions #986
Fix: Runloop might be suspended when assigned with many partitions #986
Conversation
- The Runloop command queue is a bounded queue, with a fixed size of 1024 - The Runloop stream takes commands from the queue, process them and offer a new `Poll` command into the stream (when applicable). - The `.offer` can suspend the fiber when the command queue is already full. this, in turn, makes the entire stream freeze. the underlying kafka consumer will eventually (after `max.poll.interval.ms`) be dropped from the group without any notification to the zstream. - to fix this, we can change the command queue from bounded to unbounded which will not suspend the fiber that adds the `Poll` command
👍 As expressed on Discord, unbounded memory usage is something to be avoided in general, but in this case it's preferrable over freezing. Should this lead to trouble, it's at least easily diagnosed as OOM errors. |
@svroonland any idea why the benchmark fails? it fails on lack of permissions |
@yaarix benchmarks don't work on forks |
@yaarix It's been a long time since this issue, but have you actually experienced this issue or was it just a theoretical one? |
I have experienced it and merged a fix a long time ago |
Thanks for the quick response. Alright, then we'll cancel plans for #1386 (completely forgot about this issue here) unless we have a good idea what exactly caused it and how we can mitigate that. |
Poll
command into the stream (when applicable)..offer
can suspend the fiber when the command queue is already full. this, in turn, makes the entire stream freeze. the underlying kafka consumer will eventually (aftermax.poll.interval.ms
) be dropped from the group without any notification to the zstream.Poll
command