Kubernetes Attributes Processor adds wrong k8s.container.name value #34835

Open
martinohansen opened this issue Aug 23, 2024 · 6 comments
Labels: bug (Something isn't working), processor/k8sattributes (k8s Attributes processor), Stale

Comments

martinohansen commented Aug 23, 2024

Component(s)

processor/k8sattributes

What happened?

Description

The Kubernetes Attributes Processor (k8sattributes) adds the wrong container name to pods with an init container. I read metrics using the Prometheus receiver.

Steps to Reproduce

Set up the processor to associate by pod IP, pod UID, and lastly connection details, and to pull k8s.container.name:

processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.container.name
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.ip
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
      - sources:
        - from: connection

Expose metrics from container foo with a pod spec like this:

apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    [...]
  initContainers:
  - name: linkerd-init
    [...]

Expected Result

{
  "resource": {
    "attributes": {
      "kube.container.name": "foo"
    }
  }
}

Actual Result

{
  "resource": {
    "attributes": {
      "kube.container.name": "linkerd-init"
    }
  }
}

Collector version

v0.107.0

martinohansen added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Aug 23, 2024
github-actions bot added the processor/k8sattributes (k8s Attributes processor) label on Aug 23, 2024
github-actions bot commented:

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

bacherfl commented:

Hi @martinohansen! I am currently trying to reproduce this. Would you also mind posting the configuration of the prometheus receiver and the other components in the metrics pipeline?
What confuses me a bit here is that the container attribute in the results is called kube.container.name, whereas coming from the k8sattributes processor it should be called k8s.container.name. Was that a typo, or is it in fact called that in the result? In that case the attribute might be set somewhere else (maybe from the labels detected by the prometheus receiver) and the container name might not be added by the k8sattributes processor at all, since the processor requires either k8s.container.name or container.id to be present at the time the resource is processed (see the README: "Additional container level attributes can be extracted provided that certain resource attributes are provided").
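
For illustration, a minimal sketch of what the incoming resource would need to carry for that to work (the values are made up):

    k8s.pod.ip: 10.42.0.17       # matched by the pod_association rules
    container.id: abc123         # or k8s.container.name; required for container-level attributes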


martinohansen commented Aug 27, 2024

Hi @martinohansen! I am currently trying to reproduce this. Would you also mind posting the configuration of the prometheus receiver and the other components in the metrics pipeline?

Hi @bacherfl! Thanks for looking into this, I appreciate it. I will paste the full config at the end of my response.

What confuses me a bit here is that the container attribute in the results is called kube.container.name, whereas coming from the k8sattributes processor it should be called k8s.container.name. Was that a typo, or is it in fact called that in the result?

Oops, I'm sorry about that; it's a typo and yet it isn't. For consistency on the backend we rename k8s. to kube., and I forgot to normalize that in the results above. Sorry for the confusion.

transform/rename-to-kube:
  error_mode: ignore
  metric_statements:
    - context: resource
      statements:
        - replace_all_patterns(attributes, "key", "k8s\\.(.*)", "kube.$$1")

Here is the entire config:

# Collector
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: ${env:POD_IP}:4317
        max_recv_msg_size_mib: 64
      http:
        endpoint: ${env:POD_IP}:4318
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.container.name
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.deployment.name
        - k8s.replicaset.name
        - k8s.node.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.statefulset.name
      labels:
        - tag_name: k8s.pod.label.app
          key: app
          from: pod
        - tag_name: k8s.pod.label.component
          key: component
          from: pod
        - tag_name: k8s.pod.label.zone
          key: zone
          from: pod
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.ip
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
      - sources:
        - from: connection
  transform/add-workload-label:
    metric_statements:
      - context: datapoint
        statements:
        - set(attributes["kube_workload_name"], resource.attributes["k8s.deployment.name"])
        - set(attributes["kube_workload_name"], resource.attributes["k8s.statefulset.name"])
        - set(attributes["kube_workload_type"], "deployment") where resource.attributes["k8s.deployment.name"] != nil
        - set(attributes["kube_workload_type"], "statefulset") where resource.attributes["k8s.statefulset.name"] != nil
  transform/rename-to-kube:
    error_mode: ignore
    metric_statements:
      - context: resource
        statements:
          - replace_all_patterns(attributes, "key", "k8s\\.(.*)", "kube.$$1")
exporters:
  otlphttp/pipeline-metrics:
    endpoint: ${env:OTLP_PIPELINE_METRICS_ENDPOINT}
    headers:
      Authorization: ${env:OTLP_PIPELINE_METRICS_TOKEN}
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors:
      - k8sattributes
      - transform/add-workload-label
      - transform/rename-to-kube
      exporters: [otlphttp/pipeline-metrics]


# Agent
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: ${env:POD_IP}:4317
      http:
        endpoint: ${env:POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: k8s
          tls_config:
            insecure_skip_verify: true
          scrape_interval: 15s
          kubernetes_sd_configs:
            - role: pod
              selectors:
                - role: pod
                  field: spec.nodeName=${env:NODE_NAME}
          relabel_configs:
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
              regex: "true"
              action: keep
            - action: replace
              source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
              target_label: __scheme__
              regex: (https?)
            - action: replace
              source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
              target_label: __metrics_path__
              regex: (.+)
            - action: replace
              source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: $$1:$$2
              target_label: __address__
            # Allow overriding the scrape timeout and interval from pod
            # annotation.
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_timeout]
              regex: '(.+)'
              target_label: __scrape_timeout__
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_interval]
              regex: '(.+)'
              target_label: __scrape_interval__
exporters:
  otlp:
    endpoint: "otel-collector.otel.svc.cluster.local:4317"
    tls:
      insecure: true
    retry_on_failure:
      enabled: true
processors:
  batch:
  k8sattributes:
    passthrough: true
service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch, k8sattributes]
      exporters: [otlp]

P.S. I removed some batching and memory limiter config for simplicity since it's unrelated.

bacherfl commented:

Thank you for the config, @martinohansen! I will try to reproduce the issue and get back to you once I have more insight into what could be causing this.


bacherfl commented Aug 28, 2024

I did some tests now and discovered that using kubernetes_sd_configs in the prometheus receiver creates a prometheus scrape target for each port of each container in the pod, including the init containers. The following relabel config takes care of keeping only the target whose address is based on the prometheus port set in the pod's annotations:

            - action: replace
              source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: $$1:$$2
              target_label: __address__
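
To make the effect of that rule concrete (the pod IP below is made up): for a target created from, say, container port 5775, the joined source label value is 10.42.0.17:5775;14269, which the regex ([^:]+)(?::\d+)?;(\d+) and replacement $$1:$$2 rewrite to an __address__ of 10.42.0.17:14269 - so every per-port target of a regular container ends up pointing at the annotated port.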

In the example I was testing with, I have a jaeger container with several ports exposed, so with that relabel config only <pod-ip>:14269 ended up as a target, while the other ports were omitted:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    linkerd.io/inject: disabled
    prometheus.io/port: "14269"
    prometheus.io/scrape: "true"
  labels:
    app: jaeger2
    app.kubernetes.io/component: all-in-one
    app.kubernetes.io/instance: jaeger2
    app.kubernetes.io/name: jaeger2
    app.kubernetes.io/part-of: jaeger2
  name: jaeger2
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: jaeger2
      app.kubernetes.io/component: all-in-one
      app.kubernetes.io/instance: jaeger2
      app.kubernetes.io/name: jaeger2
      app.kubernetes.io/part-of: jaeger2
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        linkerd.io/inject: disabled
        prometheus.io/port: "14269"
        prometheus.io/scrape: "true"
        sidecar.istio.io/inject: "false"
      creationTimestamp: null
      labels:
        app: jaeger2
        app.kubernetes.io/component: all-in-one
        app.kubernetes.io/instance: jaeger2
        app.kubernetes.io/name: jaeger2
        app.kubernetes.io/part-of: jaeger2
    spec:
      initContainers:
        - name: init-myservice
          image: busybox
          command: [ 'sh', '-c', "echo 'init'" ]
      containers:
        - args:
            - --sampling.strategies-file=/etc/jaeger/sampling/sampling.json
          env:
            - name: SPAN_STORAGE_TYPE
              value: memory
            - name: METRICS_STORAGE_TYPE
            - name: COLLECTOR_ZIPKIN_HOST_PORT
              value: :9411
            - name: JAEGER_DISABLED
              value: "false"
            - name: COLLECTOR_OTLP_ENABLED
              value: "true"
          image: jaegertracing/all-in-one:1.53.0
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /
              port: 14269
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 15
            successThreshold: 1
            timeoutSeconds: 1
          name: jaeger
          ports:
            - containerPort: 5775
              name: zk-compact-trft
              protocol: UDP
            - containerPort: 5778
              name: config-rest
              protocol: TCP
            - containerPort: 6831
              name: jg-compact-trft
              protocol: UDP
            - containerPort: 6832
              name: jg-binary-trft
              protocol: UDP
            - containerPort: 9411
              name: zipkin
              protocol: TCP
            - containerPort: 14267
              name: c-tchan-trft
              protocol: TCP
            - containerPort: 14268
              name: c-binary-trft
              protocol: TCP
            - containerPort: 16685
              name: grpc-query
              protocol: TCP
            - containerPort: 16686
              name: query
              protocol: TCP
            - containerPort: 14269
              name: admin-http
              protocol: TCP
            - containerPort: 14250
              name: grpc
              protocol: TCP
            - containerPort: 4317
              name: grpc-otlp
              protocol: TCP
            - containerPort: 4318
              name: http-otlp
              protocol: TCP

However, init containers mostly do not have a port defined, so this rule does not catch them, and a separate target with the same endpoint is created. The same endpoint is therefore effectively scraped twice during each scrape, yielding the same set of metrics but with different attribute sets - one including the name of the init container and the other the correct container. The OTel resource has the same name in both cases, which might explain why the container name ends up being set incorrectly in the end. What I found as a potential workaround is an additional relabel config that excludes the targets created for init containers - the prometheus library internally sets this attribute:

            - source_labels: [ __meta_kubernetes_pod_container_init ]
              regex: "false"
              action: keep
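
In the agent config above, this rule would be added alongside the existing relabel_configs of the k8s scrape job; __meta_kubernetes_pod_container_init is "true" for targets generated from init containers, so the keep action drops those duplicate targets before they are scraped.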

Would that be an option for you, @martinohansen?

github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Dec 12, 2024