-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-881: Added field to disable alerts creation #295
Conversation
@OlivierCazade: This pull request references NETOBSERV-881 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@@ -53,6 +53,10 @@ func (r *FlowCollector) ConvertTo(dstRaw conversion.Hub) error { | |||
dst.Spec.Processor.ConnectionEndTimeout = restored.Spec.Processor.ConnectionEndTimeout | |||
} | |||
|
|||
if restored.Spec.Processor.Metrics.DisableAlerts != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity, did you have "something" telling you that you had something to do here, or was it something you had to keep in mind yourself? (trying to understand how error prone it can be if we forget about the conversion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After adding the new field in v1beta1 running make generate was triggering some warning because of the new fields.
But then I mostly looked at previous PR to understand how to do it.
api/v1beta1/flowcollector_types.go
Outdated
@@ -280,6 +280,10 @@ type FLPMetrics struct { | |||
// ignoreTags is a list of tags to specify which metrics to ignore | |||
//+kubebuilder:default:={"egress","packets"} | |||
IgnoreTags []string `json:"ignoreTags,omitempty"` | |||
|
|||
// disableAlerts is a list of alerts related to FLP that should not be created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what is the best is terms of usage here, a deny-list or an allow-list. Thoughts?
In any case we should document here what are the acceptable values, ie NetObservNoFlows
and NetObservLokiError
at the moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's good to use a deny-list as we want alerts, in general, being set by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as a default behavior we want to enable alerts. Using an allow list would make upgrade with new alert a bit more trickier since we would not know if it is still the default list and if we should add the new alert to the upgrade.
I will improve the documentation to mention the possibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I would remove the FLP mention, as it might not always be clear to the users when it's related to FLP or not (they don't necessarily know well our architecture)
// disableAlerts is a list of alerts related to FLP that should not be created | |
// disableAlerts is a list of alerts that should not be created. Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @skrthomas could you please also review the CRD doc added here?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added your change but also moved to an enum so we can control the potential values.
Codecov Report
@@ Coverage Diff @@
## main #295 +/- ##
==========================================
- Coverage 47.61% 47.57% -0.04%
==========================================
Files 43 43
Lines 5015 5036 +21
==========================================
+ Hits 2388 2396 +8
- Misses 2416 2427 +11
- Partials 211 213 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
4b89c94
to
316d507
Compare
api/v1beta1/flowcollector_types.go
Outdated
) | ||
|
||
// Name of a processor alert | ||
// Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors.disableAlerts is a list of alerts related to FLP that should not be created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The end of the comment seems wrong here: disableAlerts is a list of alerts related to FLP that should not be created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The bit "related to FLP" could be unpacked a little. Could it be a list of alerts related to the health of flows in the FLP? 🤔 Also , a space is needed after the period and before disableAlerts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments. Mostly a couple of punctuation suggestions and also wondering about the same phrase Joel was asking about.
api/v1beta1/flowcollector_types.go
Outdated
) | ||
|
||
// Name of a processor alert | ||
// Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors.disableAlerts is a list of alerts related to FLP that should not be created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors.disableAlerts is a list of alerts related to FLP that should not be created | |
// Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors. disableAlerts is a list of alerts related to FLP that should not be created |
api/v1beta1/flowcollector_types.go
Outdated
) | ||
|
||
// Name of a processor alert | ||
// Possible values are: `NetObservNoFlows`, which is triggered when no flows are being observed for a certain period, and `NetObservLokiError`, which is triggered when flows are being dropped due to Loki errors.disableAlerts is a list of alerts related to FLP that should not be created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The bit "related to FLP" could be unpacked a little. Could it be a list of alerts related to the health of flows in the FLP? 🤔 Also , a space is needed after the period and before disableAlerts.
description: disableAlerts is a list of alerts that should | ||
not be created. | ||
items: | ||
description: 'Name of a processor alert Possible values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: 'Name of a processor alert Possible values | |
description: 'Name of a processor alert. Possible values |
api/v1beta1/flowcollector_types.go
Outdated
AlertLokiError = "NetObservLokiError" | ||
) | ||
|
||
// Name of a processor alert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Name of a processor alert | |
// Name of a processor alert. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as my comment below. A period is needed here.
7776ef4
to
59d9061
Compare
59d9061
to
70b7d52
Compare
@Amoghrd this PR is ready for testing, using the new field you should be able to see that the alert rule is not set. To see alert rule go to CustomResourceDefinition -> prometheusRules, there should be an instance called flowlogs-pipeline-alert and in the yaml of this instance you should have a list of the alerts under spec -> groups |
/ok-to-test |
New images:
They will expire after two weeks. Catalog source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-27d0843
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@skrthomas since there is both this PR and also Julien's one that require regenerating the doc, what I can do is once they are merged I'll open a new one with just the doc update to have all at once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, @jotak. lgtm!
/label qe-approved |
@OlivierCazade I'm merging this, I have a pending PR that conflicts with this one |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jotak The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Added disable list to not create alerts. Each alert listed in this field would not be created.