
feat: MetricsTracer implementation #2421

Merged
merged 22 commits into main from otel-impl-ddixit on Jan 30, 2024

Conversation

ddixit14
Contributor

@ddixit14 ddixit14 commented Jan 25, 2024

This PR adds a framework-neutral MetricsTracer to Gax. This class adds logic at the start and end of an RPC call to calculate latencies and count the number of RPC calls. Metric-recording responsibilities are delegated to the MetricsRecorder, providing flexibility for diverse observability frameworks.
Unit tests are added for every implemented method, including tests for MetricsTracerFactory (the class responsible for creating one MetricsTracer per operation).
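
For context, the sketch below illustrates the delegation pattern the description refers to: a MetricsTracerFactory creates one MetricsTracer per operation, and the tracer forwards all measurements to a MetricsRecorder instead of talking to an observability framework directly. The class and callback names come from this PR's discussion; the exact signatures, attribute representation, and timing logic are illustrative assumptions rather than the merged code.

```java
import java.util.Map;

// Sketch only; the merged gax classes differ in detail.
// Framework-neutral sink for measurements; an OpenTelemetry (or any other) backend implements it.
interface MetricsRecorder {
  void recordAttemptCount(long count, Map<String, String> attributes);
  void recordAttemptLatency(double latencyMs, Map<String, String> attributes);
  void recordOperationLatency(double latencyMs, Map<String, String> attributes);
}

// One tracer is created per operation (by MetricsTracerFactory) and delegates all
// recording to the MetricsRecorder.
class MetricsTracerSketch {
  private final MetricsRecorder recorder;
  private long operationStartNanos;
  private long attemptStartNanos;

  MetricsTracerSketch(MetricsRecorder recorder) {
    this.recorder = recorder;
  }

  void operationStarted() {
    operationStartNanos = System.nanoTime();
  }

  void attemptStarted() {
    attemptStartNanos = System.nanoTime();
  }

  void attemptSucceeded() {
    Map<String, String> attributes = Map.of("status", "OK"); // final status is only known here
    recorder.recordAttemptLatency((System.nanoTime() - attemptStartNanos) / 1e6, attributes);
    recorder.recordAttemptCount(1, attributes);
  }

  void operationSucceeded() {
    recorder.recordOperationLatency(
        (System.nanoTime() - operationStartNanos) / 1e6, Map.of("status", "OK"));
  }
}
```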

@ddixit14 ddixit14 requested a review from a team as a code owner January 25, 2024 17:08
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Jan 25, 2024
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Jan 25, 2024
@ddixit14 ddixit14 changed the title feat: Opentelemetry implementation feat: MetricsTracer implementation (Opentelemetry) Jan 26, 2024
@lqiu96
Contributor

lqiu96 commented Jan 29, 2024

> Thanks @ddixit14, I think it's looking good. I have a general question about recording the attempts.
> Is there a reason why attemptStarted() doesn't record/increment the number of attempts (i.e., call metricsRecorder.recordAttemptCount(1, attributes);)? I would assume that starting the attempt would record an attempt count.
> I'm guessing there is a reason I'm not too familiar with, but it seems like that might be a better spot than having to do it in attemptSucceeded, attemptCancelled, attemptFailed, etc.

> In my view, if we record the attempt count when the attempt starts, we cannot deduce its attributes (did it succeed, fail, or get cancelled?). Attributes should be recorded only when we know the final status of the attempt. If we did it in attemptStarted(), there would be no "status" attribute to record. Does that make sense?

I see. I think given the current interface of MetricsRecorder this makes sense. I wonder if MetricsRecorder should be enhanced with some methods like:

  1. recordOperationResult
  2. recordAttemptResult

I don't have a list of pros/cons for this, but it seems cleaner (?), or at least makes sense in my head. I think that recording the attempt should occur when the attempt starts and not be dependent on getting a result. Given that we need to record the status, maybe that can be done separately from recording the attempt count? (A hypothetical sketch of such an interface appears after this comment.)

This probably doesn't need to be fixed now (or fixed at all), but just throwing some thoughts out.
CC: @blakeli0
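
To make the suggestion above concrete, here is a purely hypothetical sketch of what such an enhanced recorder interface could look like. It was not adopted in this PR; the method names recordOperationResult and recordAttemptResult come from the comment above, while the signatures and attribute representation are assumptions.

```java
import java.util.Map;

// Hypothetical variant of MetricsRecorder, not the interface merged in this PR.
interface MetricsRecorderWithResults {

  // Counting could then happen in attemptStarted(), before the outcome is known,
  // so no "status" attribute would be attached to the count.
  void recordAttemptCount(long count, Map<String, String> attributes);

  // The outcome would be reported separately, once the attempt or operation finishes.
  void recordAttemptResult(String status, Map<String, String> attributes);

  void recordOperationResult(String status, Map<String, String> attributes);
}
```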

@ddixit14
Contributor Author

> I see. I think given the current interface of MetricsRecorder this makes sense. I wonder if MetricsRecorder should be enhanced with some methods like recordOperationResult and recordAttemptResult. [...] Given that we need to record the status, maybe that can be done separately from recording the attempt count? [...] CC: @blakeli0

That's exactly what came to my mind when I thought about multiple attempts for a successful operation: should we keep track of attempt_count somehow, with a counter maybe? But then I realized that we are calculating the metrics for each attempt, so keeping track of the total number of attempts is irrelevant for us. For every attempt that resulted in something, we just increment the count by 1 with the corresponding attributes.
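
The sketch below spells out the behavior described in this reply: each terminal callback records exactly one attempt, tagged with the final status, instead of counting in attemptStarted(). It reuses the hypothetical MetricsRecorder interface from the sketch under the PR description; like that sketch, it is an illustration under assumed signatures, not the merged code.

```java
import java.util.Map;

// Sketch of the per-attempt recording pattern discussed above; signatures are assumptions.
// MetricsRecorder here is the hypothetical interface from the earlier sketch.
class AttemptRecordingSketch {
  private final MetricsRecorder recorder;

  AttemptRecordingSketch(MetricsRecorder recorder) {
    this.recorder = recorder;
  }

  void attemptSucceeded() {
    recorder.recordAttemptCount(1, Map.of("status", "OK"));
  }

  void attemptCancelled() {
    recorder.recordAttemptCount(1, Map.of("status", "CANCELLED"));
  }

  void attemptFailed(Throwable error) {
    // The status attribute carries the failure type, so no separate attempt counter is needed.
    recorder.recordAttemptCount(1, Map.of("status", error.getClass().getSimpleName()));
  }
}
```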

@blakeli0
Collaborator

We don't have to mention OpenTelemetry in the title, as this implementation is framework neutral. Please add more details to the description as well.

@ddixit14 ddixit14 changed the title feat: MetricsTracer implementation (Opentelemetry) feat: MetricsTracer implementation. Jan 29, 2024
@ddixit14 ddixit14 changed the title feat: MetricsTracer implementation. feat: MetricsTracer implementation Jan 29, 2024
@ddixit14
Contributor Author

> We don't have to mention OpenTelemetry in the title, as this implementation is framework neutral. Please add more details to the description as well.

Understood, changed title and added a detailed description.

@lqiu96 lqiu96 left a comment
Contributor

LGTM. I left a few nits if you could take a quick look.

@ddixit14
Contributor Author

> LGTM. I left a few nits if you could take a quick look.

All changed and pushed, thank you.


sonarcloud bot commented Jan 30, 2024

Quality Gate failed for 'gapic-generator-java-root'

Failed conditions

0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud


sonarcloud bot commented Jan 30, 2024

Quality Gate failed for 'java_showcase_integration_tests'

Failed conditions

0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

@ddixit14 ddixit14 merged commit 5c291e8 into main Jan 30, 2024
35 of 39 checks passed
@ddixit14 ddixit14 deleted the otel-impl-ddixit branch January 30, 2024 17:43
Labels
size: l Pull request size is large.
3 participants