Add Idle query and error rate metrics #534

Merged

@apurvam apurvam commented Dec 14, 2017

No description provided.

@apurvam apurvam changed the base branch from 4.0.x to master December 14, 2017 19:35

@dguy dguy left a comment

Thanks @apurvam, just left one comment about tests

@@ -41,8 +48,86 @@ public KsqlEngineMetrics(String metricGroupPrefix, KsqlEngine ksqlEngine) {
this.ksqlEngine = ksqlEngine;

this.metricGroupName = metricGroupPrefix + "-query-stats";
this.numActiveQueries = metrics.sensor(metricGroupName + "-active-queries");
numActiveQueries.add(

Contributor

Unit tests for this class?

Contributor Author

This class is basically glue code: it creates a bunch of sensors and then just calls 'record' on them. All the logic is in MetricCollectors. The corresponding tests are in MetricCollectorsTest.

However, since it does manage sensor lifecycle, I added a test to ensure that all the sensors owned by the class are removed on close.

Let me know if there are any other productive tests which can be written for this.
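A lifecycle test of the kind described above could look roughly like the following. This is only a sketch: `SketchRegistry` and `SketchEngineMetrics` are hypothetical stand-ins written for illustration (they are not the Kafka `Metrics` class or the real `KsqlEngineMetrics`), and the `"ksql"` prefix and the `-idle-queries` sensor name are assumptions. Only the `-query-stats` and `-active-queries` suffixes come from the diff.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a metrics registry; NOT the Kafka Metrics class.
class SketchRegistry {
    private final Set<String> sensors = new HashSet<>();

    void addSensor(String name) { sensors.add(name); }
    void removeSensor(String name) { sensors.remove(name); }

    // Returns a copy so callers can compare snapshots safely.
    Set<String> sensorNames() { return new HashSet<>(sensors); }
}

// Hypothetical glue class: creates sensors, remembers them, removes them on close.
class SketchEngineMetrics implements AutoCloseable {
    private final SketchRegistry registry;
    private final List<String> owned = new ArrayList<>();

    SketchEngineMetrics(SketchRegistry registry, String metricGroupPrefix) {
        this.registry = registry;
        String metricGroupName = metricGroupPrefix + "-query-stats";
        register(metricGroupName + "-active-queries");
        register(metricGroupName + "-idle-queries"); // assumed name, for illustration
    }

    private void register(String name) {
        registry.addSensor(name);
        owned.add(name);
    }

    @Override
    public void close() {
        // Remove only the sensors this instance created.
        for (String name : owned) {
            registry.removeSensor(name);
        }
        owned.clear();
    }
}

class SensorLifecycleSketch {
    // True when close() leaves the registry exactly as it was before construction.
    static boolean closeRemovesOwnedSensors() {
        SketchRegistry registry = new SketchRegistry();
        registry.addSensor("unrelated-sensor"); // must survive close()
        Set<String> before = registry.sensorNames();

        SketchEngineMetrics metrics = new SketchEngineMetrics(registry, "ksql");
        boolean created = registry.sensorNames().size() > before.size();
        metrics.close();

        return created && registry.sensorNames().equals(before);
    }

    public static void main(String[] args) {
        System.out.println(closeRemovesOwnedSensors());
    }
}
```

Comparing a before and after snapshot of the registry means the check does not need to be updated each time a new sensor is added to the class.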

Contributor Author

Actually, I thought about it a bit more, and I think tests for the plumbing can be useful. Someone could change metric names, for example, and it is useful to have tests that catch this, because such changes will eventually break the dashboards built on the metrics.

I added a bunch of test cases to make sure that the metrics are being reported with the expected names. If the metric name conventions change in the future, we will catch them.
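As a rough illustration of a name-pinning test, the sketch below compares the names a class reports against a hard-coded expected set, so any rename fails loudly. Everything here is hypothetical: `MetricNameSketch`, the `"ksql"` prefix, and the `-idle-queries` and `-error-rate` names are assumptions made for the example, not the names used in this PR. Only the `-query-stats` and `-active-queries` suffixes appear in the diff.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class MetricNameSketch {
    // Names a hypothetical engine-metrics class would report for a given prefix.
    static Set<String> reportedNames(String metricGroupPrefix) {
        String metricGroupName = metricGroupPrefix + "-query-stats";
        Set<String> names = new LinkedHashSet<>();
        names.add(metricGroupName + "-active-queries");
        names.add(metricGroupName + "-idle-queries"); // assumed name
        names.add(metricGroupName + "-error-rate");   // assumed name
        return names;
    }

    // The expected names are spelled out literally on purpose: a dashboard
    // refers to the full string, so the test should too.
    static Set<String> expectedNames() {
        return new LinkedHashSet<>(Arrays.asList(
            "ksql-query-stats-active-queries",
            "ksql-query-stats-idle-queries",
            "ksql-query-stats-error-rate"));
    }

    static boolean namesMatchExpected() {
        return expectedNames().equals(reportedNames("ksql"));
    }

    public static void main(String[] args) {
        System.out.println(namesMatchExpected());
    }
}
```

Writing the expected strings out in full, rather than rebuilding them from the same prefix logic, is what lets the check catch a change in the naming convention itself.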

Apurva Mehta added 3 commits December 15, 2017 13:52
1. Number of idle queries (i.e., those with no throughput)
2. Min/max/avg messages consumed per second across all queries
3. Error rate aggregated across all queries in the engine
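The three aggregates listed in the commits can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `QueryRateSample` and `EngineAggregateSketch` are hypothetical names, a query is treated as idle when its consumption rate is exactly zero, and the engine-wide error rate is assumed to be the sum of the per-query rates.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

// Hypothetical per-query snapshot; field names are assumptions.
class QueryRateSample {
    final double messagesConsumedPerSec;
    final double errorsPerSec;

    QueryRateSample(double messagesConsumedPerSec, double errorsPerSec) {
        this.messagesConsumedPerSec = messagesConsumedPerSec;
        this.errorsPerSec = errorsPerSec;
    }
}

class EngineAggregateSketch {
    // 1. Idle queries: those with no throughput (rate exactly 0.0 in this sketch).
    static long idleQueries(List<QueryRateSample> queries) {
        return queries.stream()
            .filter(q -> q.messagesConsumedPerSec == 0.0)
            .count();
    }

    // 2. Min/max/avg messages consumed per second across all queries.
    static DoubleSummaryStatistics consumptionStats(List<QueryRateSample> queries) {
        return queries.stream()
            .mapToDouble(q -> q.messagesConsumedPerSec)
            .summaryStatistics();
    }

    // 3. Error rate aggregated across all queries (assumed here to be a sum).
    static double totalErrorRate(List<QueryRateSample> queries) {
        return queries.stream()
            .mapToDouble(q -> q.errorsPerSec)
            .sum();
    }

    public static void main(String[] args) {
        List<QueryRateSample> queries = Arrays.asList(
            new QueryRateSample(0.0, 0.5),
            new QueryRateSample(10.0, 0.0),
            new QueryRateSample(20.0, 1.5));

        DoubleSummaryStatistics stats = consumptionStats(queries);
        System.out.println("idle=" + idleQueries(queries));
        System.out.println("min=" + stats.getMin()
            + " max=" + stats.getMax()
            + " avg=" + stats.getAverage());
        System.out.println("errors/s=" + totalErrorRate(queries));
    }
}
```

A real implementation would likely treat "no throughput" over a time window rather than as an instantaneous zero, which this sketch does not attempt to model.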
@apurvam apurvam force-pushed the idle-query-and-error-rate-metrics branch from f23fa85 to 543f322 Compare December 15, 2017 21:55

@dguy dguy left a comment

Thanks @apurvam, LGTM

@apurvam apurvam requested a review from rodesai December 19, 2017 23:55

apurvam commented Dec 19, 2017

ping @hjafarpour @rodesai


@hjafarpour hjafarpour left a comment

LGTM.
I will have to sit down with you to get more details on this for my understanding.

@apurvam apurvam merged commit 1c5ece1 into confluentinc:master Dec 20, 2017
@apurvam apurvam deleted the idle-query-and-error-rate-metrics branch December 20, 2017 23:06
3 participants