-
Notifications
You must be signed in to change notification settings - Fork 581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request to Add Detector for Kubernetes #4136
Comments
This is interesting, @dashpole may I pick up the work? I recently added k8s.cluster.uid for Would love to work on this feat as well, kindly assign :) |
Lets make sure maintainers are on-board first. @open-telemetry/go-approvers please let me know if there are any objections to the proposed new component. I'm happy to maintain it long-term. It might be a good idea to have this in the spec first. |
SGTM 👍 |
@anuraaga curious if you have feedback on the design above |
@dashpole should we introduce this to the spec? I don't think it should block this issue, but I would love to see a unified approach across all languages |
Thanks @dashpole - for reference, here is my manifest config
(btw, I couldn't find a way to get container name with downstream API though I notice this issue description mentions it, wonder if I missed it) This is indeed pretty annoying :) IIUC, the proposal would remove that OTEL_RESOURCE_ATTRIBUTES gnarly line, which is an improvement. However, I don't know how much of an improvement it is, skimming through the OTEP I saw one motivation is to have a copy-pastable snippet to get these filled. Unfortunately, the most important attribute Another issue is while the OTEP mentioned issues with merge and telemetry schema related to the So to sum up - if this detector existed, I would take advantage of the simplification by tweaking the variable names, which would be an improvement, but IMO not a big one. Given that this would need to be implemented in all the languages, there does seem to be low bang for the relatively large effort. As a user, personally I would be much happier if manpower, which I presume isn't unlimited, was spent on making sure all the instrumentation records metrics and more complete support for the SDK environment variables as while the variable setting is awkward, it's doable. |
I understand how they are different (application internal vs. external collector), but in practice there does appear to be significant overlap between this and the k8sattributes processor in the OpenTelemetry Collector. Meaning, how common is it that there are applications running in a kubernetes cluster where there is not already an OpenTelemetry Collector infrastructure deployed and collecting most of this metadata for all pods, regardless of language? Is the use case more for applications that are sending directly to an external-to-cluster endpoint? |
@evantorrie @anuraaga I can add some color here since this all started with a comment from me on @dashpole's PR. If you're coming from a Prometheus background, you really want an What sparked this for me is that I saw that the GCP detector, provided a cluster. Traditionally, this is not always easy to obtain because K8s doesn't bake cluster into any of the API objects; it's not part of the data model. So I was hoping to stitch together this data like so:
To @evantorrie's comments: Injecting the Pod details into that |
But to @anuraaga's point, given that this still relies on env variables, it might not really provide a lot of value especially if you need every SDK to conform to this. I could easily pluck this data from the |
Agreed. If you are already setting environment variables, it isn't much of an improvement. IMO, the main advantage is that even without setting any environment variables users would detect pod name and namespace name using fallbacks described above. That makes the "default" experience for less sophisticated users much better, as multiple pods within a deployment wouldn't have overlapping telemetry. |
Thanks @jaredjenkins - for me entering this conversation was exactly that, wondering why my metrics don't have pod name and then finding this https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/detectors/gcp/gke.go#L31 But as you confirmed, since this requires manifest changes anyways, and there will be non-k8s related resource attributes I need to set too, the impact feels low to me. It definitely needs to be in docs but not necessarily in a detector, for me at least. If everything could happen transparently/automatically with no manifest change, of course that would be a dream come true. So while it would take a while for a new version to become broadly usable, if a future k8s version provided such a metadata API, then it would be great (hint, hint in case anyone has friends in that org :) ). |
Ah yeah, I assumed there is some fundamental problem with the fallbacks hence the deprecation of the gke detector I linked. If the non-manifest based detection works fine then I don't see an issue with it, mostly was commenting on the processing of the env vars since it seemed to be the main focus of the proposal. |
The HOSTNAME env var defaults to the pod name in k8s, but can be overridden with pod.spec.hostname: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields The namespace detection should always work. So they are pretty good, but not perfect. |
The HOSTNAME can also be changed with setHostnameAsFQDN |
Cool - in that case my suggestion would be to focus this proposal on that, the generic k8s detector that magically detects the two attributes. As long as it doesn't override OTEL_RESOURCE_ATTRIBUTES, which with sdk autoconfiguration defaults I don't think it would, it would provide default values that can be overridden in those special cases, and indeed that does seem to be a significant UX improvement. While that variable does have some issues as the otep calls out, they don't seem to be k8s specific so only having these additional k8s variables is probably not worth it and would make any spec change that much harder as new variables often don't make it in, xref open-telemetry/opentelemetry-specification#3573 open-telemetry/semantic-conventions#261 |
would resource.Merge fail since the schema url of OTEL_RESOURCE_ATTRIBUTES can differ from the schema_url used by the detector? Or will that always succeed |
Unfortunately I suspect it's going to be different among the SDKs as resource merging with different URLs isn't really well defined by the spec AFAIK. For example, Java will remove the schema That being said, IMO actually |
Background
At one point, I had written an OTep about how kubernetes resource detectors should function, but it wasn't accepted due to lack of interest. open-telemetry/oteps#195
I was recently reminded of the impact of this in #2310 (comment).
Proposed Solution
Add a Kubernetes resource detector in detectors/kubernetes/. It will have a single implementation of the resource.Detector interface, and no configuration.
It will detect:
From:
If pod name is not discovered, fall back to the HOSTNAME env var. This defaults to the pod's name, but can be overridden in the pod spec.
If pod namespace is not discovered, fall back to reading
/var/run/secrets/kubernetes.io/serviceaccount/namespace
. This is always the pod's namespace AFAIK, but isn't a recommended way to get it.Prior Art
OTel Operator: https://github.com/open-telemetry/opentelemetry-operator/blob/76cff59b0c0640da29c56d5ae91eae5fe843ae5b/pkg/constants/env.go#L25
Tasks
Dockerfile
file to build example application.docker-compose.yml
to run example in a docker environment to demonstrate instrumentation.@jaredjenkins
The text was updated successfully, but these errors were encountered: