Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Metricbeat][Kubernetes] Update kube state metrics fetched metrics #34448

Merged
merged 25 commits into from
Feb 2, 2023
Merged

[Metricbeat][Kubernetes] Update kube state metrics fetched metrics #34448

merged 25 commits into from
Feb 2, 2023

Conversation

constanca-m
Copy link
Contributor

@constanca-m constanca-m commented Feb 1, 2023

What does this PR do?

  1. Add a new function to the test files that goes over the metrics of a KSM files and checks if it exists in there.
  2. Remove KSM metric files with old metrics.
  3. Add KSM metric files from versions supported by K8s versions.
  4. Remove deprecated metrics that don't cause any change on the mapped fields.

How to test this PR locally

  1. Clone this repo.
  2. Go to each metricset and just run go test .

Results prior to removing metrics

These results are from the new function. It highlights the metrics that are no longer present in any of the files. It does not check for labels, since they do not always appear (e.g. if the metric holds no value, then we have no access to the labels).

State container:

=== RUN   TestMetricsFamily
ktest.go:71: Unknown metric: kube_pod_container_resource_requests_cpu_cores
ktest.go:71: Unknown metric: kube_pod_container_resource_limits_cpu_cores
ktest.go:71: Unknown metric: kube_pod_container_status_restarts
ktest.go:71: Unknown metric: kube_pod_container_resource_requests_memory_bytes
ktest.go:71: Unknown metric: kube_pod_container_resource_limits_memory_bytes
--- FAIL: TestMetricsFamily (0.00s)

State node:

=== RUN   TestMetricsFamily
    ktest.go:71: Unknown metric: kube_node_status_ready
    ktest.go:71: Unknown metric: kube_node_status_allocatable_pods
    ktest.go:71: Unknown metric: kube_node_status_capacity_memory_bytes
    ktest.go:71: Unknown metric: kube_node_status_allocatable_cpu_cores
    ktest.go:71: Unknown metric: kube_node_status_capacity_pods
    ktest.go:71: Unknown metric: kube_node_status_allocatable_memory_bytes
    ktest.go:71: Unknown metric: kube_node_status_capacity_cpu_cores
--- FAIL: TestMetricsFamily (0.00s)

All other state metricsets don't use deprecated metrics.

Results post deleting those metrics

All tests should pass.

Related issues

Relates to #34313.

constanca-m and others added 19 commits January 31, 2023 09:35
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
.
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
@constanca-m constanca-m added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Feb 1, 2023
@constanca-m constanca-m self-assigned this Feb 1, 2023
@constanca-m constanca-m requested review from a team as code owners February 1, 2023 16:35
@constanca-m constanca-m requested review from tetianakravchenko, belimawr and leehinman and removed request for a team February 1, 2023 16:35
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Feb 1, 2023
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 1, 2023
@mergify
Copy link
Contributor

mergify bot commented Feb 1, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @constanca-m? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Copy link
Collaborator

elasticmachine commented Feb 1, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-02-02T10:52:14.133+0000

  • Duration: 65 min 4 sec

Test stats 🧪

Test Results
Failed 0
Passed 4279
Skipped 876
Total 5155

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@MichaelKatsoulis
Copy link
Contributor

@constanca-m great! So it's just 10 fields that need removal. Go on to remove them.
About the test TestMetricsFamily I find it useful.
Everytime we upgrade the supported Kube-state-metric and the files, the test will tell us if we collect non-existing fields.

One thing we should consider is searching for fields that are marked as Deprecated. I can see 2

# HELP kube_endpoint_address_available (Deprecated since v2.6.0) Number of addresses available in endpoint.
# HELP kube_endpoint_address_not_ready (Deprecated since v2.6.0) Number of addresses not ready in endpoint

But we are not collecting them, so it is fine. I believe that in next major (ksm version 3) will be completely removed.
We could have a test that looks for fields marked as deprecated and check if we collect them. But it should not fail the test, just warn us.
Not sure how we are going to look for these warnings though, when the tests just run on CI.
@ChrsMark wdyt?

@constanca-m
Copy link
Contributor Author

constanca-m commented Feb 2, 2023

Maybe we will have to remove more fields, because we don't check for labels with that test. I will check if any of the labels we have are unique to the metrics we need to remove.
Having a warning for the deprecated fields seems interesting, but since we don't collect them, I don't know if it is worth it. We have a warning for fields that are not present in every file - this warning is just that, it does not cause the test to fail. We could use it to have a good idea of the deprecated fields. @MichaelKatsoulis

@constanca-m
Copy link
Contributor Author

constanca-m commented Feb 2, 2023

Just to give an example. If we add a new field field_not_present_in_all_files to one of the KSM files, but not to all of them, and, simultaneously, use this field in state_cronjob, the result of running that test would be:

=== RUN   TestMetricsFamily
    ktest.go:69: Warning: metric field_not_present_in_all_files is not present in all files.
--- PASS: TestMetricsFamily (0.00s)

So we know that something changed when versions were updated.

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
@MichaelKatsoulis
Copy link
Contributor

Having a warning for the deprecated fields seems interesting, but since we don't collect them,

Yes there is no point in that. we should check for fields marked as deprecated and only in case some metricset collects them we should have a warning

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
@MichaelKatsoulis
Copy link
Contributor

So we know that something changed when versions were updated.

So this test will catch cases where let's say we support ksm versions 2.8, 2.9 and 3.0, and there is a filed the we use but was removed in 3.0. It won't be present in all files. That should indicate us that it has maybe been substituted with another field and we need to start collecting it. If we ignore the warning, then when we will start supporting versions 3.0, 3.1, 3.2 the tests will fail because the filed won't be present in any file.
Now if we add the extra test with the marked as deprecated fields, then while we are supporting versions 2.7,2.8,2.9 we will be warned beforehand that some fields are going to be deleted permanently.
I think we are covered with all these tests.
The only thing that needs manual search is the addition of fields in newer versions. That's something we need to take a look from time to time as there may be interesting metrics that we can collect.

Copy link
Contributor

@tetianakravchenko tetianakravchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the linter before merging, I think it is a valid comment

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
@constanca-m constanca-m added the backport-v8.6.0 Automated backport with mergify label Feb 2, 2023
@constanca-m constanca-m merged commit 3268f33 into elastic:main Feb 2, 2023
mergify bot pushed a commit that referenced this pull request Feb 2, 2023
…34448)

* Remove deprecated metrics

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
(cherry picked from commit 3268f33)
@constanca-m constanca-m removed the backport-v8.6.0 Automated backport with mergify label Feb 2, 2023
@constanca-m constanca-m deleted the test-current-metrics branch February 21, 2023 13:46
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…34448)

* Remove deprecated metrics

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants