Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8sprocessor: add container.name field #1167

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 22 additions & 7 deletions processor/k8sprocessor/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -41,23 +41,26 @@ func TestLoadConfig(t *testing.T) {
config, err := configtest.LoadConfigFile(
t,
path.Join(".", "testdata", "config.yaml"),
factories)
factories,
)

require.Nil(t, err)
require.NoError(t, err)
require.NotNil(t, config)

p0 := config.Processors["k8s_tagger"]
assert.Equal(t, p0,
assert.Equal(t,
&Config{
ProcessorSettings: configmodels.ProcessorSettings{
TypeVal: "k8s_tagger",
NameVal: "k8s_tagger",
},
APIConfig: k8sconfig.APIConfig{AuthType: k8sconfig.AuthTypeServiceAccount},
})
},
p0,
)

p1 := config.Processors["k8s_tagger/2"]
assert.Equal(t, p1,
assert.Equal(t,
&Config{
ProcessorSettings: configmodels.ProcessorSettings{
TypeVal: "k8s_tagger",
Expand All @@ -66,7 +69,17 @@ func TestLoadConfig(t *testing.T) {
APIConfig: k8sconfig.APIConfig{AuthType: k8sconfig.AuthTypeKubeConfig},
Passthrough: false,
Extract: ExtractConfig{
Metadata: []string{"podName", "podUID", "deployment", "cluster", "namespace", "node", "startTime"},
Metadata: []string{
"cluster",
"containerName",
"deployment",
"hostName",
"namespace",
"node",
"podName",
"podUID",
"startTime",
},
Annotations: []FieldExtractConfig{
{TagName: "a1", Key: "annotation-one"},
{TagName: "a2", Key: "annotation-two", Regex: "field=(?P<value>.+)"},
Expand All @@ -89,5 +102,7 @@ func TestLoadConfig(t *testing.T) {
{Key: "key2", Value: "value2", Op: "not-equals"},
},
},
})
},
p1,
)
}
14 changes: 14 additions & 0 deletions processor/k8sprocessor/kube/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ package kube
import (
"fmt"
"regexp"
"sort"
"strings"
"sync"
"time"
Expand Down Expand Up @@ -235,6 +236,19 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
tags[r.Name] = c.extractField(v, r)
}
}

if c.Rules.ContainerName {
if len(pod.Spec.Containers) > 0 {
var names []string
for _, container := range pod.Spec.Containers {
names = append(names, container.Name)
}

sort.Strings(names)
tags[conventions.AttributeK8sContainer] = strings.Join(names, ",")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps ", " (with a space after the comma) to make it more clear? Or using the array (AnyValue_ArrayValue) type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k8s.container.name should be about a single container. I assume we can't do that here because we correlate based on IP and the IP is shared across all the containers in the pod. If we make this an array (seems better) I think we should also consider adding a new semantic convention of k8s.pod.containers for this purpose.

Copy link
Contributor

@pmm-sumo pmm-sumo Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked around semantic conventions to see if there's any other attribute troubled by a similar property (usually being a single value, but sometimes being a collection). The closest I get is process.command_line, but there are really no other examples I could find.

The same problem will apply to Kubernetes Service - there might be several of them pointing to a single pod.

I am not sure if this ambiguity could be solved in any way currently (@dashpole, perhaps you might have some suggestions?).

This leads to several specification-related questions, such as:

  • should there bet two attributes (one singular, one plural) depending if there's (typically) one or (sometimes) multiple values or, only one of them?
  • if there's only one attribute name, should it be plural or singular when there might be a collection of values for a key?
  • if there's only one attribute name, what should be the type of attribute value (always an array or perhaps a string when there's only single entry)?

For reference, there's a an ongoing discussion on pluralization guidelines for metrics

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a metric about a pod (e.g. is correlated with the Pod's IP), then IMO it shouldn't have a container attribute. If it is about a container, then it should have a single container name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dashpole so essentially when a single container cannot be accurately assigned then the information should be not included, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at least in my opinion.

From monitoring systems i'm familiar with, I haven't seen a good way to handle lists of items in label (attribute) values.

The other thing to consider is that technically containers can be added to a running pod with the debug container feature, so this attribute wouldn't necessarily be constant over the lifetime of a pod.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other thing to consider is that technically containers can be added to a running pod with the debug container feature, so this attribute wouldn't necessarily be constant over the lifetime of a pod.

Even though this seems like a niche use case I agree that it might introduce incorrect data when we'd tag as above.

From monitoring systems i'm familiar with, I haven't seen a good way to handle lists of items in label (attribute) values.

As for this I was trying to approach tags rewrite to not be of map[string]string type but rather a map of string to AnyValue to have the tags already in the type that they will be serialized to.

This could make it easier as having a plural tag as key would already indicate there's an array in the value.

}
}

return tags
}

Expand Down
37 changes: 26 additions & 11 deletions processor/k8sprocessor/kube/client_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -306,7 +306,7 @@ func TestHandlerWrongType(t *testing.T) {
func TestExtractionRules(t *testing.T) {
c, _ := newTestClientWithRulesAndFilters(t, ExtractionRules{}, Filters{})

pod := &api_v1.Pod{
pod1 := &api_v1.Pod{
ObjectMeta: meta_v1.ObjectMeta{
Name: "auth-service-abc12-xyz3",
UID: "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
Expand All @@ -323,6 +323,14 @@ func TestExtractionRules(t *testing.T) {
},
Spec: api_v1.PodSpec{
NodeName: "node1",
Containers: []api_v1.Container{
{
Name: "container-zzzzz",
},
{
Name: "sidecar-container-aaaaa",
},
},
},
Status: api_v1.PodStatus{
PodIP: "1.1.1.1",
Expand All @@ -333,10 +341,12 @@ func TestExtractionRules(t *testing.T) {
name string
rules ExtractionRules
attributes map[string]string
pod *api_v1.Pod
}{{
name: "no-rules",
rules: ExtractionRules{},
attributes: nil,
pod: pod1,
}, {
name: "deployment",
rules: ExtractionRules{
Expand All @@ -345,16 +355,18 @@ func TestExtractionRules(t *testing.T) {
attributes: map[string]string{
"k8s.deployment.name": "auth-service",
},
pod: pod1,
}, {
name: "metadata",
rules: ExtractionRules{
Deployment: true,
Namespace: true,
PodName: true,
PodUID: true,
Node: true,
Cluster: true,
StartTime: true,
Deployment: true,
Namespace: true,
PodName: true,
PodUID: true,
Node: true,
Cluster: true,
StartTime: true,
ContainerName: true,
},
attributes: map[string]string{
"k8s.deployment.name": "auth-service",
Expand All @@ -363,8 +375,10 @@ func TestExtractionRules(t *testing.T) {
"k8s.node.name": "node1",
"k8s.pod.name": "auth-service-abc12-xyz3",
"k8s.pod.uid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"k8s.pod.startTime": pod.GetCreationTimestamp().String(),
"k8s.pod.startTime": pod1.GetCreationTimestamp().String(),
"k8s.container.name": "container-zzzzz,sidecar-container-aaaaa",
},
pod: pod1,
}, {
name: "labels",
rules: ExtractionRules{
Expand All @@ -388,13 +402,14 @@ func TestExtractionRules(t *testing.T) {
"l2": "v5",
"a1": "av1",
},
pod: pod1,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
c.Rules = tc.rules
c.handlePodAdd(pod)
p, ok := c.GetPodByIP(pod.Status.PodIP)
c.handlePodAdd(tc.pod)
p, ok := c.GetPodByIP(tc.pod.Status.PodIP)
require.True(t, ok)

assert.Equal(t, len(tc.attributes), len(p.Attributes))
Expand Down
15 changes: 8 additions & 7 deletions processor/k8sprocessor/kube/kube.go
Original file line number Diff line number Diff line change
Expand Up @@ -102,13 +102,14 @@ type FieldFilter struct {
// ExtractionRules is used to specify the information that needs to be extracted
// from pods and added to the spans as tags.
type ExtractionRules struct {
Deployment bool
Namespace bool
PodName bool
PodUID bool
Node bool
Cluster bool
StartTime bool
Deployment bool
Namespace bool
PodName bool
PodUID bool
Node bool
Cluster bool
StartTime bool
ContainerName bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would ContainerNames make more sense as we add all names we see?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this but taking into account @dmitryax's comment I'm not sure it will fit since the tag k8s.container.name is in singular.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep it singular here for consistency, even if it may have multiple values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 done. Singular it is.


Annotations []FieldExtractionRule
Labels []FieldExtractionRule
Expand Down
40 changes: 22 additions & 18 deletions processor/k8sprocessor/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,13 +31,15 @@ const (
filterOPExists = "exists"
filterOPDoesNotExist = "does-not-exist"

metdataNamespace = "namespace"
metadataPodName = "podName"
metadataPodUID = "podUID"
metadataStartTime = "startTime"
metadataDeployment = "deployment"
metadataCluster = "cluster"
metadataNode = "node"
metadataNamespace = "namespace"
metadataPodName = "podName"
metadataPodUID = "podUID"
metadataStartTime = "startTime"
metadataDeployment = "deployment"
metadataCluster = "cluster"
metadataNode = "node"
metadataHostName = "hostName"
metadataContainerNames = "containerNames"
)

// Option represents a configuration option that can be passes.
Expand Down Expand Up @@ -68,31 +70,33 @@ func WithExtractMetadata(fields ...string) Option {
return func(p *kubernetesprocessor) error {
if len(fields) == 0 {
fields = []string{
metdataNamespace,
metadataCluster,
metadataDeployment,
metadataNamespace,
metadataNode,
metadataPodName,
metadataPodUID,
metadataStartTime,
metadataDeployment,
metadataCluster,
metadataNode,
}
}
for _, field := range fields {
switch field {
case metdataNamespace:
case metadataCluster:
p.rules.Cluster = true
case metadataContainerNames:
p.rules.ContainerName = true
case metadataDeployment:
p.rules.Deployment = true
case metadataNamespace:
p.rules.Namespace = true
case metadataNode:
p.rules.Node = true
case metadataPodName:
p.rules.PodName = true
case metadataPodUID:
p.rules.PodUID = true
case metadataStartTime:
p.rules.StartTime = true
case metadataDeployment:
p.rules.Deployment = true
case metadataCluster:
p.rules.Cluster = true
case metadataNode:
p.rules.Node = true
default:
return fmt.Errorf("\"%s\" is not a supported metadata field", field)
}
Expand Down
4 changes: 4 additions & 0 deletions processor/k8sprocessor/options_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -198,13 +198,16 @@ func TestWithExtractLabels(t *testing.T) {
func TestWithExtractMetadata(t *testing.T) {
p := &kubernetesprocessor{}
assert.NoError(t, WithExtractMetadata()(p))
// Default
assert.True(t, p.rules.Namespace)
assert.True(t, p.rules.PodName)
assert.True(t, p.rules.PodUID)
assert.True(t, p.rules.StartTime)
assert.True(t, p.rules.Deployment)
assert.True(t, p.rules.Cluster)
assert.True(t, p.rules.Node)
// Optional
assert.False(t, p.rules.ContainerName)

p = &kubernetesprocessor{}
err := WithExtractMetadata("randomfield")(p)
Expand All @@ -219,6 +222,7 @@ func TestWithExtractMetadata(t *testing.T) {
assert.False(t, p.rules.StartTime)
assert.False(t, p.rules.Deployment)
assert.False(t, p.rules.Node)
assert.False(t, p.rules.ContainerName)
}

func TestWithFilterLabels(t *testing.T) {
Expand Down
8 changes: 5 additions & 3 deletions processor/k8sprocessor/testdata/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,14 @@ processors:
extract:
metadata:
# extract the following well-known metadata fields
- podName
- podUID
- deployment
- cluster
- containerName
- deployment
- hostName
- namespace
- node
- podName
- podUID
- startTime

annotations:
Expand Down