-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kafka check - first pass #810
Conversation
'activemq_58', | ||
'cassandra', | ||
'jmx', | ||
'kafka', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you should use the is_jmx flag in the init_config. I think that's what we should use for newer jmx checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is that read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what unit is kafka.net.bytes_in/kafka.net.bytes_out? I'm getting really low values (< 1) |
|
||
|
||
instances: | ||
# - host: localhost |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, you should add the instances section on top of the file so we don't have to scroll all the way down to configure a check.
@conorbranagan @remh Added a kafka_consumer check which queries kafka and zookeeper for their offsets and computes lag: 9675070 |
…e there are partitions without consumers
@@ -0,0 +1,86 @@ | |||
from collections import defaultdict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaultdict is not in python 2.4, you should import from compat:
from compat import defaultdict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the libraries that this check depends on are extremely incompatible with 2.4, how should we handle that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two options:
- Wait for the self contained agent
- Catch the python version and raise an exception if < 2.6
@clofresh @conorbranagan That's great stuff! |
… rest of our code
Conflicts: jmxfetch.py
Thanks! |
Uses the standard JMX format to pull in metrics from Kafka.
We're missing metrics by topic which is currently a limitation of the JMX configuration. I think we'll have to add something comparable to how we post-process cassandra metrics (https://github.com/DataDog/jmxfetch/blob/master/src/main/java/org/datadog/jmxfetch/Reporter.java#L94-L104) because the Kafka metrics have the topic name in them. I think we'd also want to limit to a list of topics to handle the case where a person has hundreds of topics.
Here's how a per-topic metric looks in JMX as a bean:
kafka.server":type="BrokerTopicMetrics",name="check_runs-FailedFetchRequestsPerSec"