-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
record a metric with open census to ensure backwards compat #9065
record a metric with open census to ensure backwards compat #9065
Conversation
The healthcheck extension currently relies on an OpenCensus view to determine the health of the collector. This behaviour is not great as it only looks at span exports, but it exists today. To avoid breaking it, I suggest this PR that adds the same OC view even on the otel path, and records the same metric. Signed-off-by: Alex Boten <aboten@lightstep.com>
Signed-off-by: Alex Boten <aboten@lightstep.com>
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #9065 +/- ##
==========================================
- Coverage 91.49% 91.48% -0.01%
==========================================
Files 316 316
Lines 17181 17201 +20
==========================================
+ Hits 15720 15737 +17
- Misses 1165 1167 +2
- Partials 296 297 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to fix this problem in the healthcheck extension instead? Like could it be made to check the featuregate and use otel metrics instead of opencensus if it is enabled?
Oh I see now that we already have a way forward for the healthcheckextension, and this is a hack until then |
@mwear @codeboten you got a link to the healthcheckextension issue that, when closed, means we can remove this hack? Can we also open an issue to track the removal of this hack? |
@TylerHelmuth This issue captures most of the problems w/ the healthcheck extension with this issue open-telemetry/opentelemetry-collector-contrib#26661 capturing the refactoring of it. I can create an issue to track removing this hack once open-telemetry/opentelemetry-collector-contrib#26661 is done |
@TylerHelmuth added #9069 |
Closing this PR. After further investigation, the behaviour in the current healthcheck extension doesn't work as expected. It registers itself as an OpenCensus exporter to be called when census wants to emit a metric. For the In addition, the extension does not read the value of the metric, which means the threshold it calculates is solely based on the number of times the view is exported, which has no correlation to failures to export. Since the view is configured only to record measurements for span exports, the healthcheck extension only ever triggered for a single signal, which doesn't make it overly useful. After discussing this with other collector approvers and maintainers, we've decided to flag the feature as not working as expected in the healthcheck extension. Since this extension is being revamped to use component status, it isn't worth fixing in the short term. I'll open a PR to update the README to warn users of this issue, this is only to prevent a breaking change multiple times in short successions for users. cc @TylerHelmuth @mx-psi |
This follows open-telemetry/opentelemetry-collector#9065 (comment) Signed-off-by: Alex Boten <aboten@lightstep.com>
This follows open-telemetry/opentelemetry-collector#9065 (comment) --------- Signed-off-by: Alex Boten <aboten@lightstep.com> Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
The healthcheck extension currently relies on an OpenCensus view to determine the health of the collector. This behaviour is not great as it only looks at span exports, but it exists today. To avoid breaking it, I suggest this PR that adds the same OC view even on the otel path, and records the same metric.
Note to reviewers: This is only a hack that's needed until the new healthcheck extension that uses component status is ready to go. cc @mwear