Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/mongodbreceiver] Add metric versioning checks to prevent partial errors #13266

Conversation

JonathanWamsley
Copy link
Contributor

Description:
In use or in the integration tests, many partial errors are being emitted. A metric versioning and storage engine check can be applied before metrics are recorded to prevent these partial errors from occurring.
Figuring what partial errors were slipping through was a pain so a more detailed partial error message and logging was added to prevent a generic could not find key for metric message.

Link to tracking Issue:
#13155

Testing:

  • added test case for triggering partial error when metric paths are not valid
  • added test case for triggering partial errors when a bad client response is received

Documentation:
The readme metrics section has mongodb version and storage engine requirements to get certain metrics.

Also while collecting mongodb.global_lock.time, it appears that mongodb 2.6, 3.0, 4.0, and 5.0 all have the metric available so the old mongodb version check can be removed since it is available for all versions.

// Mongo version greater than or equal to 4.0 have it in the serverStats at "globalLock", "totalTime"
// reference: https://docs.mongodb.com/v4.0/reference/command/serverStatus/#server-status-global-lock
mongo40, _ := version.NewVersion("4.0")
if s.mongoVersion.GreaterThanOrEqual(mongo40) {

@JonathanWamsley JonathanWamsley marked this pull request as ready for review August 12, 2022 17:07
@JonathanWamsley JonathanWamsley requested a review from a team August 12, 2022 17:07
Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot to parse through. Is there a way to break this up into smaller PRs?

@JonathanWamsley
Copy link
Contributor Author

Hey @djaglowski ,
I first added a partial error check shown in the issue to panic before emitting metrics to see the partial errors when running the integration tests. That was not very helpful as they were could not find key for metric; I added different logging comments to quickly figure out which metrics were causing the partial errors and fixed them. This may be where 2 prs can be made like at this stage

Ideally though I am not sure if that is the right way to do code development. I think adding in better logging and partial error messages w/ tests is what I should have done originally which is why I implemented that. That specific metric messaging allowed me to make meaningful partial error tests for the client metric response and for each metric/attribute.

Having a way to collect partial errors in the integration tests would also be great but the consumer package MetricsSink struct populates the metrics and which I don't think there is a way to get partial errors from there.

If you want, I can split the pr into pr1 with the metric version updates like in the above link. Then pr2 with warning/partial error messages and unit tests. thoughts?

@djaglowski
Copy link
Member

@JonathanWamsley, that sounds reasonable, as long as we end up in the same place. It'll help me be able to follow what you're doing though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants