Kafka unavailability blocks the event loop #480
Comments
@cristianrgreco Stumbled upon this while investigating a similar issue. Have you tried configuring the
Kind of weird that the Kafka client supports async methods but they block
Yes, we ended up setting it to 3s. The result was that the application wasn't completely unresponsive, but there's still a huge performance degradation. Our app receives a large, constant stream of messages that need to be published to Kafka, as well as handling online traffic, so when there are just a couple of threads in the event loop, the 3s block is still a killer.
@cristianrgreco I ended up "fixing" this by submitting the KafkaProducer#send calls as tasks to a ThreadPoolExecutor with a bounded LinkedBlockingQueue. ExecutorService#submit doesn't block, but of course it can throw RejectedExecutionException if the executor's work queue is full. Now at least the blocking happens outside the event loop. We had a call to KafkaProducer#send in the Netty event loop, and after a few calls while the broker was down the whole event loop was blocked. Btw, you can of course always adjust the buffer.memory setting, which I think defaults to 32 MB. Then the only concern is if all the brokers go offline.
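The workaround described above could look roughly like the following. This is only a minimal sketch, not the commenter's actual code: the class name BoundedOffloader and the method trySend are hypothetical, and a plain Runnable stands in for the KafkaProducer#send call so the example has no Kafka dependency. It relies on ThreadPoolExecutor's default rejection policy, which throws RejectedExecutionException when the bounded queue is full and no more threads may be created.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs potentially blocking sends on a dedicated pool
// so the caller (for example an event-loop thread) never waits on them.
final class BoundedOffloader {
    private final ThreadPoolExecutor executor;

    BoundedOffloader(int threads, int queueCapacity) {
        // Fixed-size pool with a bounded queue. The default AbortPolicy
        // throws RejectedExecutionException once the queue is full.
        this.executor = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
    }

    // Returns true if the task was accepted, false if the pool is saturated.
    // Never blocks the calling thread.
    boolean trySend(Runnable blockingSend) {
        try {
            executor.execute(blockingSend);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // caller decides: drop, retry later, or report an error
        }
    }

    void shutdown() {
        executor.shutdownNow(); // interrupts workers that are still blocked
    }

    public static void main(String[] args) {
        BoundedOffloader offloader = new BoundedOffloader(1, 1);
        CountDownLatch brokerBack = new CountDownLatch(1);
        // Stand-in for a send that blocks while the broker is down.
        Runnable stuckSend = () -> {
            try {
                brokerBack.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(offloader.trySend(stuckSend)); // picked up by the worker
        System.out.println(offloader.trySend(stuckSend)); // waits in the queue
        System.out.println(offloader.trySend(stuckSend)); // rejected, queue is full
        brokerBack.countDown();
        offloader.shutdown();
    }
}
```

With this shape the caller has to decide what a rejected send means for the application (drop the message, retry later, or return an error), which is the trade-off mentioned in the comment.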
Sounds like Kafka operations always have to be offloaded from the event loop. You can do this globally with:
micronaut:
  server:
    thread-selection: IO
  executors:
    io:
      type: fixed
      nThreads: 75 # or whatever thread pool size works best for your app
I work on the same team as Cristian and did something similar:
This solution kind of works but is not ideal, as:
Expected Behavior
According to the documentation, defining a KafkaClient as follows:
Should not block, even if Kafka is unavailable.
The documentation also states:
And we have tried this as well with no difference.
Actual Behaviour
When we try to publish while Kafka is unavailable, the event loop gets blocked for up to the configured max-block (default 60s). We can confirm via a thread dump that all event-loop threads are TIMED_WAITING with the following:
Looking into the Apache Kafka client internals, we see that both awaitUpdate and waitOnMetadata are blocking. The result is that the application is unresponsive.
Steps To Reproduce
KafkaClient to consume all threads in the event-loop group.
Environment Information
OS:
CentOS Linux 7 (Core)
JDK:
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
Micronaut Kafka module:
io.micronaut.kafka:micronaut-kafka:3.3.3
Producer config:
Example Application
No response
Version
2.5.13