-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
k8sprocessor: add container.name field #1167
k8sprocessor: add container.name field #1167
Conversation
Not sure why the tests failed:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some nits.
Cluster bool | ||
StartTime bool | ||
HostName bool | ||
ContainerName bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would ContainerNames
make more sense as we add all names we see?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this but taking into account @dmitryax's comment I'm not sure it will fit since the tag k8s.container.name
is in singular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can keep it singular here for consistency, even if it may have multiple values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 done. Singular it is.
} | ||
|
||
sort.Strings(names) | ||
tags[conventions.AttributeContainerName] = strings.Join(names, ",") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to semantic conventions
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/resource/semantic_conventions/k8s.md#container, if we are using k8s spec as the source, it's not a real container name. container.name
supposed to be provided by container engine. We should use k8s.container.name
here instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done but please comment whether it makes sense to have configuration option in plural and the tag in singular.
if pod.Spec.Hostname == "" { | ||
tags[conventions.AttributeHostName] = pod.Name | ||
} else { | ||
tags[conventions.AttributeHostName] = pod.Spec.Hostname | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hostname := pod.Spec.Hostname
if hostname == "" {
hostname = pod.Name
}
tags[conventions.AttributeHostName] = hostname
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reasons for this? (other than maybe accessing pod.Spec.Hostname
just once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmalek-sumo, I think such approach makes the intention more clear. Currently, you check pod.Spec.Hostname
and assign pod.Name
which is not obvious why, unless the line below is being read
Of course, that might be a small thing, but one where clarity could be improved easily
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 done
Codecov Report
@@ Coverage Diff @@
## master #1167 +/- ##
==========================================
- Coverage 89.89% 89.86% -0.03%
==========================================
Files 276 276
Lines 13621 13629 +8
==========================================
+ Hits 12244 12248 +4
- Misses 1011 1014 +3
- Partials 366 367 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Added missing test |
if hostname == "" { | ||
hostname = pod.Name | ||
} | ||
tags[conventions.AttributeHostName] = hostname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's safe to reuse host.name
attribute here. It can be easily confused with hostname of the node where the pod is running on. What is the use case? Do we have to reuse host.name
attribute? if not, we might want to use another attribute with a more specific name like k8s.pod.hostname
. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmitryax+1 on k8s.pod.hostname
(I think there are several attributes that could be easily added to semantic conventions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed @dmitryax.
We'll need to add it to https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/conventions/opentelemetry.go.
@pmm-sumo mentioned he'd like to contribute that so let's wait for that and use it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have started with opening a PR to specs: open-telemetry/opentelemetry-specification#1072
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a discussion under the PR. I am wondering if maybe host.hostname
is a better fit here. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmm-sumo the host.hostname
was recently removed from the conventions open-telemetry/opentelemetry-specification#838
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to recap, we had a discussion under the specification PR. Since hostname
is ambiguous in this setting (containers host names vs the machine where it runs host name), it is unfortunately not a simple problem to solve. Additionally, it is present on all containerised environments and need to be addressed in a generic way.
To avoid taking focus out of the GA, the suggestion is to skip adding this feature for now and go back to it later, with a solid discussion involving several parties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, most of the time, the node hostname will reside in k8s.node.name
and pod hostname in k8s.pod.name
unless they are overwritten (which I believe is not a very common scenario).
a65f246
to
9502d95
Compare
@owais you know k8s processor best, please review. |
@pmalek-sumo could you remove the |
Sure thing, done. |
} | ||
|
||
sort.Strings(names) | ||
tags[conventions.AttributeK8sContainer] = strings.Join(names, ",") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps ", "
(with a space after the comma) to make it more clear? Or using the array (AnyValue_ArrayValue
) type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k8s.container.name should be about a single container. I assume we can't do that here because we correlate based on IP and the IP is shared across all the containers in the pod. If we make this an array (seems better) I think we should also consider adding a new semantic convention of k8s.pod.containers
for this purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around semantic conventions to see if there's any other attribute troubled by a similar property (usually being a single value, but sometimes being a collection). The closest I get is process.command_line, but there are really no other examples I could find.
The same problem will apply to Kubernetes Service - there might be several of them pointing to a single pod.
I am not sure if this ambiguity could be solved in any way currently (@dashpole, perhaps you might have some suggestions?).
This leads to several specification-related questions, such as:
- should there bet two attributes (one singular, one plural) depending if there's (typically) one or (sometimes) multiple values or, only one of them?
- if there's only one attribute name, should it be plural or singular when there might be a collection of values for a key?
- if there's only one attribute name, what should be the type of attribute value (always an array or perhaps a string when there's only single entry)?
For reference, there's a an ongoing discussion on pluralization guidelines for metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a metric about a pod (e.g. is correlated with the Pod's IP), then IMO it shouldn't have a container attribute. If it is about a container, then it should have a single container name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dashpole so essentially when a single container cannot be accurately assigned then the information should be not included, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, at least in my opinion.
From monitoring systems i'm familiar with, I haven't seen a good way to handle lists of items in label (attribute) values.
The other thing to consider is that technically containers can be added to a running pod with the debug container feature, so this attribute wouldn't necessarily be constant over the lifetime of a pod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other thing to consider is that technically containers can be added to a running pod with the debug container feature, so this attribute wouldn't necessarily be constant over the lifetime of a pod.
Even though this seems like a niche use case I agree that it might introduce incorrect data when we'd tag as above.
From monitoring systems i'm familiar with, I haven't seen a good way to handle lists of items in label (attribute) values.
As for this I was trying to approach tags rewrite to not be of map[string]string
type but rather a map of string
to AnyValue
to have the tags already in the type that they will be serialized to.
This could make it easier as having a plural tag as key would already indicate there's an array in the value.
@pmalek-sumo @pmm-sumo can you clarify when we would expect to see an annotation with multiple container names? I ask because I agree with @dashpole here that we should either use a singlar value or not set this attribute, especially when considering metrics about pods and containers. I'd lean in the direction of a singular attribute name. I tend to think of this as a single list or array containing multiple values, but I'm curious about the specifics in this case. |
I agree that would be the preferred approach. One thing I am wondering if in such case the attribute should be set only when a single container exists (and not set when there are multiple containers) or never set |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Since there has been no conclusion on this and the API as of now revolves around pods (not containers, hence we can't map 1:1 from which container the resource is coming from) it's hard to tag with container information consistently. In case the discussion get traction again, I'm willing to continue work on this. |
* Add more semantic conventions for k8s * Update CHANGELOG Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
Description: This PR adds option to tag data with
container.name
field.Link to tracking Issue: #141
Testing: Unit tests added.