NETOBSERV-557: add eBPF agent metrics for troubleshooting #582

Merged
merged 3 commits into from
Feb 28, 2024

Conversation

msherif1234
Contributor

Description

Adds a Prometheus service to the eBPF agent to expose metrics for debugging.

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Feb 26, 2024

@msherif1234: This pull request references NETOBSERV-557 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

codecov bot commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 56.45933%, with 91 lines in your changes missing coverage. Please review.

Project coverage is 67.15%. Comparing base (f4d840b) to head (e0db291).
Report is 4 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| controllers/ebpf/agent-metrics-test.go | 0.00% | 33 Missing ⚠️ |
| ...s/flowcollector/v1beta1/zz_generated.conversion.go | 31.25% | 18 Missing and 4 partials ⚠️ |
| ...pis/flowcollector/v1beta1/zz_generated.deepcopy.go | 0.00% | 16 Missing ⚠️ |
| ...pis/flowcollector/v1beta2/zz_generated.deepcopy.go | 56.25% | 7 Missing ⚠️ |
| controllers/ebpf/agent_controller.go | 79.41% | 5 Missing and 2 partials ⚠️ |
| controllers/ebpf/agent-metrics.go | 91.78% | 4 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
- Coverage   67.56%   67.15%   -0.41%     
==========================================
  Files          69       71       +2     
  Lines        8192     8407     +215     
==========================================
+ Hits         5535     5646     +111     
- Misses       2317     2409      +92     
- Partials      340      352      +12     
Flag Coverage Δ
unittests 67.15% <56.45%> (-0.41%) ⬇️

@jotak
Member

jotak commented Feb 27, 2024

FYI I also added a follow-up to implement TLS: https://issues.redhat.com/browse/NETOBSERV-1532
SG. Depending on how valuable that feature is, we can decide whether we need the TLS pieces or not; since it's for debugging only and the user will turn it on on demand, I think adding TLS to the mix would be a bit too much.

@msherif1234 msherif1234 force-pushed the agent-metrics branch 4 times, most recently from 51af226 to 35e2c12 Compare February 27, 2024 12:36
@msherif1234 msherif1234 requested a review from jotak February 27, 2024 12:37
@msherif1234 msherif1234 force-pushed the agent-metrics branch 2 times, most recently from aeed6d2 to 8905a3a Compare February 27, 2024 13:24
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 27, 2024

New images:

  • quay.io/netobserv/network-observability-operator:467ce2b
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-467ce2b
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-467ce2b

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:467ce2b make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-467ce2b

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-467ce2b
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 27, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 27, 2024

New images:

  • quay.io/netobserv/network-observability-operator:39ba465
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-39ba465
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-39ba465

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:39ba465 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-39ba465

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-39ba465
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 28, 2024
@jotak
Member

jotak commented Feb 28, 2024

@msherif1234 I don't think this is for debugging only: all our other components provide some metrics, even by default in production. TBH I don't think it's a big overhead, but we can see and measure that. All these metrics are low cardinality; they're just a bunch of float64 values fetched from time to time, so it shouldn't be big.

For now we can go ahead with that disabled by default, but if it doesn't have a visible overhead we can think about enabling it by default.
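
To illustrate the low-cardinality point above, here is a minimal Go sketch that is not taken from this PR. It uses only the standard library rather than the Prometheus client, and the `Gauge` type, the metric name `example_agent_evictions`, and the `source` label are all hypothetical. It shows that a gauge with a small, fixed set of label values yields one float64 sample per label value, rendered in the Prometheus text exposition format.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Gauge is a hypothetical stand-in for a Prometheus gauge with a single
// label: it holds one float64 sample per distinct label value.
type Gauge struct {
	Name   string
	Label  string
	Values map[string]float64 // label value -> current sample
}

// SeriesCount returns how many time series the gauge produces, which is
// the number of distinct label values (its cardinality).
func (g Gauge) SeriesCount() int { return len(g.Values) }

// Render formats the gauge in the Prometheus text exposition format,
// with label values sorted so the output is deterministic.
func (g Gauge) Render() string {
	keys := make([]string, 0, len(g.Values))
	for k := range g.Values {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	fmt.Fprintf(&b, "# TYPE %s gauge\n", g.Name)
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s=%q} %g\n", g.Name, g.Label, k, g.Values[k])
	}
	return b.String()
}

func main() {
	// Assumed example values; real metric names and labels may differ.
	g := Gauge{
		Name:   "example_agent_evictions",
		Label:  "source",
		Values: map[string]float64{"hashmap": 12, "ringbuffer": 3},
	}
	fmt.Print(g.Render())
	fmt.Println("series:", g.SeriesCount())
}
```

With two label values the scrape payload is two samples, so the cost per scrape stays small as long as the label set is bounded.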

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 28, 2024

New images:

  • quay.io/netobserv/network-observability-operator:255ef34
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-255ef34
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-255ef34

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:255ef34 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-255ef34

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-255ef34
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@msherif1234
Contributor Author

/retest

@msherif1234
Contributor Author

/test e2e-operator

@jotak
Member

jotak commented Feb 28, 2024

/lgtm

@msherif1234
Contributor Author

/approve

openshift-ci bot commented Feb 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

@openshift-merge-bot openshift-merge-bot bot merged commit 609eb0a into netobserv:main Feb 28, 2024
13 checks passed
Labels
approved jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR.

3 participants