-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-1522 Open Telemetry exporter API #671
Conversation
Skipping CI for Draft Pull Request. |
TLS ClientTLS `json:"tls"` | ||
|
||
// Rules to transform flow logs to Opentelemetry-conformant format. | ||
Rules []api.GenericTransformRule `json:"rules,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we expose rules or hardcode them ?
See netobserv/flowlogs-pipeline#531 for FLP implementation
41f1e80
to
6a1582e
Compare
fromStage.EncodeOtelMetrics(fmt.Sprintf("%s-metrics", name), api.EncodeOtlpMetrics{ | ||
OtlpConnectionInfo: &conn, | ||
Prefix: constants.OperatorName, | ||
Metrics: flpMetrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
☝️ we should probably adapt the metrics as same as logs to match Open Telemetry field namings
For now I just got rid of the transform stage and kept the same metrics as prometheus. Let me know your thoughts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, that means to rewrite flpMetrics
to map all the labels and the filters with the otel semantic.
That can be done in a follow-up I guess, if you prefer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can add a todo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 6b556fb
|
||
// Type of Open Telemetry connection. The available options are `http` and `grpc`. | ||
// +optional | ||
Type string `json:"type"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename protocol
?
That's the term used in otel: https://opentelemetry.io/docs/collector/configuration/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes !
|
||
// Time duration of no-flow to wait before deleting data item | ||
// +kubebuilder:default:="2m" | ||
ExpiryTime *metav1.Duration `json:"expiryTime,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same setting exists for regular prom metrics but we chose not to expose it as this is more of an implementation detail. I think we should do the same here ;
or if we really want to expose it, then it might be better to set as an advanced setting, and even share the same between prom & otel configs (because at the end of the day this is really the same thing: how long do we keep in memory metrics that reference elements that may disappear, such as those having a pod name label and that pod doesn't exist anymore)
But for simplicity I wouldn't even bother about that, and just remove this setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I'll remove it then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep the pushTimeInterval
?
|
||
// Prefix added to each metric name | ||
// +optional | ||
Prefix string `json:"prefix,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like "expiryTime", there's also a "prefix" config that we force as "netobserv" for regular metrics, and I wonder if we really should expose it, or keep being more opinionated. I tend to prefer being more opinionated, and expose less config, at least as long as nobody has a need for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I can remove it and force netobserv
here then
@@ -16,6 +16,7 @@ limitations under the License. | |||
package v1beta2 | |||
|
|||
import ( | |||
"github.com/netobserv/flowlogs-pipeline/pkg/api" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not have coupling with the FLP api here, the config definitions should be duplicated (or if we force the semantic convention then we can just remove that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I duplicated the GenericTransformRule
Addressed feedback & updated go vendor |
As this PR is still waiting for review, I took the chance to do the metrics transformation and added testing |
TargetPort int `json:"targetPort"` | ||
|
||
// Protocol of Open Telemetry connection. The available options are `http` and `grpc`. | ||
// +optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be an enum I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, efdb0d4
// Custom rules to transform flow logs to Opentelemetry-conformant format. | ||
// By default the following format will be used: https://github.com/rhobs/observability-data-model/blob/main/network-observability.md#format-proposal | ||
// Setting `customRules` will fully override the default rules allowing you to export each field to your naming convention | ||
// +optional | ||
Rules *[]GenericTransformRule `json:"customRules,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general we try to make the CRD more opinionated than what FLP exposes. Here, I think we should present that more clearly as a mapping of fields, and tell why we expose it (because there's no standard)
Here's a proposal:
// Custom rules to transform flow logs to Opentelemetry-conformant format. | |
// By default the following format will be used: https://github.com/rhobs/observability-data-model/blob/main/network-observability.md#format-proposal | |
// Setting `customRules` will fully override the default rules allowing you to export each field to your naming convention | |
// +optional | |
Rules *[]GenericTransformRule `json:"customRules,omitempty"` | |
// Custom fields mapping to an OpenTelemetry conformant format. | |
// By default, NetObserv format proposal is used: https://github.com/rhobs/observability-data-model/blob/main/network-observability.md#format-proposal . | |
// As there is currently no accepted otlp standard for L3/4 network logs, you can freely override it with your own. | |
// +optional | |
FieldsMapping *[]GenericTransformRule `json:"fieldsMapping,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also done in efdb0d4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, lgtm!
/ok-to-test |
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:cd31dfd make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-cd31dfd Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-cd31dfd
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
Fixed empty metrics in 9c9c293 |
/ok-to-test |
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:809a682 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-809a682 Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-809a682
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:eb79fe5 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-eb79fe5 Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-eb79fe5
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
Fix the value key mentionned in otel metrics config |
/ok-to-test |
/label qe-approved |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Description
Add Open Telemetry to FlowCollector exporters:
Custom transform rules can be applied using
customRules
field inFlowCollectorOpenTelemetry
exporter. Else default rules from https://github.com/rhobs/observability-data-model/blob/main/network-observability.md#format-proposal are applied.See
OpenTelemetryDefaultTransformRules
in constants.Rely on https://github.com/netobserv/flowlogs-pipeline/blob/main/contrib/opentelemetry/demo.md for Otel installation on your cluster
TODO: dig into ways to implement traces. Current traces configuration is disabled to avoid confusion.
Dependencies
netobserv/flowlogs-pipeline#672
netobserv/flowlogs-pipeline#684
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.