This repository has been archived by the owner on May 16, 2023. It is now read-only.

Missing option to disable the daemonset in metricbeat #702

Closed

erihanse opened this issue Jul 1, 2020 · 8 comments · Fixed by #716
Labels: enhancement, metricbeat

Comments

@erihanse
Contributor

erihanse commented Jul 1, 2020

Chart version:
7.8.0
Kubernetes version:
1.15.9
Kubernetes provider: E.g. GKE (Google Kubernetes Engine)
on-premise
Helm Version:
2.12.2

Output of helm get release:
REVISION: 1
RELEASED: Tue Jun 30 20:48:16 2020
CHART: elasticsearch-7.3.2
USER-SUPPLIED VALUES:
clusterHealthCheckParams: wait_for_status=yellow&timeout=10s
minimumMasterNodes: 1
nodeSelector:
  kubernetes.io/hostname: demoworker2test
podSecurityPolicy:
  name: 50-rootfilesystem
rbac:
  create: true
  serviceAccountName: elasticsearch-master
readinessProbe:
  initialDelaySeconds: 30
  timeoutSeconds: 20
replicas: 1
resources:
  limits:
    memory: 2.5Gi
  requests:
    memory: 1Gi
sysctlInitContainer:
  enabled: false

COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=yellow&timeout=10s
clusterName: elasticsearch
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraEnvs: []
extraInitContainers: ""
extraVolumeMounts: ""
extraVolumes: ""
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.3.2
ingress:
  annotations: {}
  enabled: false
  hosts:
  - chart-example.local
  path: /
  tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector:
  kubernetes.io/hostname: demoworker2test
persistence:
  annotations: {}
  enabled: true
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: 50-rootfilesystem
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
priorityClassName: ""
protocol: http
rbac:
  create: true
  serviceAccountName: elasticsearch-master
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 30
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 20
replicas: 1
resources:
  limits:
    cpu: 1000m
    memory: 2.5Gi
  requests:
    cpu: 100m
    memory: 1Gi
roles:
  data: "true"
  ingest: "true"
  master: "true"
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  httpPortName: http
  nodePort: null
  transportPortName: transport
  type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
  enabled: false
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

HOOKS:
---
# elasticsearch-evlyd-test
apiVersion: v1
kind: Pod
metadata:
  name: "elasticsearch-evlyd-test"
  annotations:
    "helm.sh/hook": test-success
spec:
  containers:
  - name: "elasticsearch-ldhnq-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:7.3.2"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=yellow&timeout=10s'
  restartPolicy: Never
MANIFEST:

---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: "elasticsearch-master"
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.2"
    app: "elasticsearch-master"
---
# Source: elasticsearch/templates/role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: "elasticsearch-master"
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.2"
    app: "elasticsearch-master"
rules:
  - apiGroups:
      - extensions
    resources:
      - podsecuritypolicies
    resourceNames:
      - "50-rootfilesystem"
    verbs:
      - use
---
# Source: elasticsearch/templates/rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: "elasticsearch-master"
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.2"
    app: "elasticsearch-master"
subjects:
  - kind: ServiceAccount
    name: "elasticsearch-master"
    namespace: "elastic"
roleRef:
  kind: Role
  name: "elasticsearch-master"
  apiGroup: rbac.authorization.k8s.io
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}
    
spec:
  type: ClusterIP
  selector:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "7"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 1
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
      
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        heritage: "Tiller"
        release: "elasticsearch"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:
        
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
        
      serviceAccountName: "elasticsearch-master"
      nodeSelector:
        kubernetes.io/hostname: demoworker2test
        
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      initContainers:

      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
          
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.3.2"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 20
          
          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=yellow&timeout=10s' )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                http () {
                    local path="${1}"
                    if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                      BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                    else
                      BASIC_AUTH=''
                    fi
                    curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
                }

                if [ -f "${START_FILE}" ]; then
                    echo 'Elasticsearch is already running, lets check the node is healthy'
                    http "/"
                else
                    echo 'Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=yellow&timeout=10s" )'
                    if http "/_cluster/health?wait_for_status=yellow&timeout=10s" ; then
                        touch ${START_FILE}
                        exit 0
                    else
                        echo 'Cluster is not yet ready (request params: "wait_for_status=yellow&timeout=10s" )'
                        exit 1
                    fi
                fi
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 2.5Gi
          requests:
            cpu: 100m
            memory: 1Gi
          
        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: "elasticsearch-master-0,"
          - name: discovery.seed_hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
          - name: ES_JAVA_OPTS
            value: "-Xmx1g -Xms1g"
          - name: node.data
            value: "true"
          - name: node.ingest
            value: "true"
          - name: node.master
            value: "true"
        volumeMounts:
          - name: "elasticsearch-master"
            mountPath: /usr/share/elasticsearch/data

Describe the bug:
Earlier we used the metricbeat chart from the stable repository, https://github.com/helm/charts/tree/master/stable/metricbeat. In that chart we could choose whether to enable or disable the daemonset and the deployment. As we know, that chart is deprecated. We're trying to migrate to this elastic chart, but there seems to be no way of disabling the daemonset here. To quote https://www.elastic.co/guide/en/beats/metricbeat/7.x/metricbeat-module-kubernetes.html:

Some of the previous components are running on each of the Kubernetes nodes (like kubelet or proxy) while others provide a single cluster-wide endpoint. This is important to determine the optimal configuration and running strategy for the different metricsets included in the module.

We only use metricbeat for Kubernetes events, which seems difficult to achieve with the current state of this chart. Is there a reason both the daemonset and the deployment are now mandatory?
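For reference, this is roughly the kind of values override the stable chart accepted, and what we would like to be able to do here (a minimal sketch; the daemonset.enabled and deployment.enabled keys are assumptions, not existing options in this chart):

# values.yaml override (illustrative key names)
daemonset:
  enabled: false    # don't render the per-node DaemonSet
deployment:
  enabled: true     # keep the single cluster-wide Deployment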

@blurpy

blurpy commented Jul 6, 2020

I would also like an option to disable the daemonset.

@fatmcgav
Contributor

fatmcgav commented Jul 6, 2020

@erihanse thank you for opening this issue.

You are correct: this does appear to be a functional gap between the two charts.

Would you be interested in opening a PR to add a flag to enable/disable both the deployment and the daemonset?

For backward-compatibility reasons, both will need to default to enabled.
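A minimal sketch of how such a flag could gate the existing manifests (flag names and file layout are assumptions here, not the final implementation):

# values.yaml — both default to true to preserve current behaviour
daemonset:
  enabled: true
deployment:
  enabled: true

# templates/daemonset.yaml — wrap the existing manifest in a conditional
{{- if .Values.daemonset.enabled }}
# ... existing DaemonSet manifest, unchanged ...
{{- end }}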

fatmcgav added the enhancement and metricbeat labels on Jul 6, 2020
@erihanse
Contributor Author

erihanse commented Jul 6, 2020

@fatmcgav sure, I'll give it a try :-)

@fatmcgav
Contributor

fatmcgav commented Jul 6, 2020

@erihanse cool. We can provide feedback, pointers, etc. if you need any assistance.

@erihanse
Contributor Author

erihanse commented Jul 7, 2020

@fatmcgav I tried making sense of https://github.com/elastic/helm-charts/blob/master/CONTRIBUTING.md#submitting-a-pull-request, but I still wonder: will the changes be backported to older versions? We're currently on 7.3.2.

Another question: I can't make the tests pass.

➜  metricbeat git:(master) git pull upstream master
From github.com:elastic/helm-charts
 * branch            master     -> FETCH_HEAD
Already up to date.
➜  metricbeat git:(master) make pytest
pytest -sv --color=yes
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.6.9, pytest-4.1.0, py-1.8.0, pluggy-0.13.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/erikah/ok/plattform/helm-charts/metricbeat, inifile:
collected 32 items                                                                                                                                                                      

tests/metricbeat_test.py::test_defaults FAILED
tests/metricbeat_test.py::test_adding_a_extra_container PASSED
tests/metricbeat_test.py::test_adding_a_extra_init_container PASSED
tests/metricbeat_test.py::test_adding_envs PASSED
tests/metricbeat_test.py::test_adding_deprecated_envs PASSED
tests/metricbeat_test.py::test_adding_image_pull_secrets PASSED
tests/metricbeat_test.py::test_adding_host_networking PASSED
tests/metricbeat_test.py::test_adding_tolerations PASSED
tests/metricbeat_test.py::test_adding_deprecated_tolerations PASSED
tests/metricbeat_test.py::test_override_the_default_update_strategy PASSED
tests/metricbeat_test.py::test_setting_a_custom_service_account PASSED
tests/metricbeat_test.py::test_self_managing_rbac_resources PASSED
tests/metricbeat_test.py::test_setting_pod_security_context PASSED
tests/metricbeat_test.py::test_setting_deprecated_pod_security_context PASSED
tests/metricbeat_test.py::test_adding_in_metricbeat_config PASSED
tests/metricbeat_test.py::test_adding_in_deprecated_metricbeat_config PASSED
tests/metricbeat_test.py::test_adding_a_secret_mount PASSED
tests/metricbeat_test.py::test_adding_a_deprecated_secret_mount PASSED
tests/metricbeat_test.py::test_adding_a_extra_volume_with_volume_mount PASSED
tests/metricbeat_test.py::test_adding_a_deprecated_extra_volume_with_volume_mount PASSED
tests/metricbeat_test.py::test_adding_a_node_selector PASSED
tests/metricbeat_test.py::test_adding_deprecated_node_selector PASSED
tests/metricbeat_test.py::test_adding_an_affinity_rule PASSED
tests/metricbeat_test.py::test_priority_class_name PASSED
tests/metricbeat_test.py::test_cluster_role_rules PASSED
tests/metricbeat_test.py::test_adding_pod_labels PASSED
tests/metricbeat_test.py::test_adding_serviceaccount_annotations PASSED
tests/metricbeat_test.py::test_adding_env_from PASSED
tests/metricbeat_test.py::test_adding_deprecated_env_from PASSED
tests/metricbeat_test.py::test_overriding_resources PASSED
tests/metricbeat_test.py::test_adding_deprecated_resources PASSED
tests/metricbeat_test.py::test_setting_fullnameOverride FAILED

======================================================================================= FAILURES ========================================================================================
_____________________________________________________________________________________ test_defaults _____________________________________________________________________________________

    def test_defaults():
        config = """
        """
    
        r = helm_template(config)
    
        assert name in r["daemonset"]
    
        c = r["daemonset"][name]["spec"]["template"]["spec"]["containers"][0]
        assert c["name"] == project
        assert c["image"].startswith("docker.elastic.co/beats/" + project + ":")
    
        assert c["env"][0]["name"] == "POD_NAMESPACE"
        assert c["env"][0]["valueFrom"]["fieldRef"]["fieldPath"] == "metadata.namespace"
    
        assert "curl --fail 127.0.0.1:5066" in c["livenessProbe"]["exec"]["command"][-1]
    
        assert "metricbeat test output" in c["readinessProbe"]["exec"]["command"][-1]
    
        assert r["daemonset"][name]["spec"]["template"]["spec"]["tolerations"] == []
    
        assert "hostNetwork" not in r["daemonset"][name]["spec"]["template"]["spec"]
        assert "dnsPolicy" not in r["daemonset"][name]["spec"]["template"]["spec"]
        assert (
            "hostNetwork"
            not in r["deployment"][name + "-metrics"]["spec"]["template"]["spec"]
        )
        assert (
            "dnsPolicy"
            not in r["deployment"][name + "-metrics"]["spec"]["template"]["spec"]
        )
    
        assert (
            r["deployment"][name + "-metrics"]["spec"]["template"]["spec"]["tolerations"]
            == []
        )
    
        assert (
            r["daemonset"][name]["spec"]["template"]["spec"]["containers"][0][
                "securityContext"
            ]["runAsUser"]
            == 0
        )
        assert (
            r["daemonset"][name]["spec"]["template"]["spec"]["containers"][0][
                "securityContext"
            ]["privileged"]
            == False
        )
        assert (
            r["deployment"][name + "-metrics"]["spec"]["template"]["spec"]["containers"][0][
                "securityContext"
            ]["runAsUser"]
            == 0
        )
        assert (
            r["deployment"][name + "-metrics"]["spec"]["template"]["spec"]["containers"][0][
                "securityContext"
            ]["privileged"]
            == False
        )
    
        # Empty customizable defaults
        assert "imagePullSecrets" not in r["daemonset"][name]["spec"]["template"]["spec"]
    
        assert r["daemonset"][name]["spec"]["updateStrategy"]["type"] == "RollingUpdate"
    
        assert (
            r["daemonset"][name]["spec"]["template"]["spec"]["serviceAccountName"] == name
        )
    
        cfg = r["configmap"]
    
        assert name + "-config" not in cfg
        assert name + "-daemonset-config" in cfg
        assert name + "-deployment-config" in cfg
    
        assert "metricbeat.yml" in cfg[name + "-daemonset-config"]["data"]
        assert "metricbeat.yml" in cfg[name + "-deployment-config"]["data"]
    
        assert "module: system" in cfg[name + "-daemonset-config"]["data"]["metricbeat.yml"]
        assert (
            "module: system"
            not in cfg[name + "-deployment-config"]["data"]["metricbeat.yml"]
        )
        assert "state_pod" not in cfg[name + "-daemonset-config"]["data"]["metricbeat.yml"]
        assert "state_pod" in cfg[name + "-deployment-config"]["data"]["metricbeat.yml"]
    
        daemonset = r["daemonset"][name]["spec"]["template"]["spec"]
    
        assert {
            "configMap": {"name": name + "-config", "defaultMode": 0o600},
            "name": project + "-config",
        } not in daemonset["volumes"]
        assert {
            "configMap": {"name": name + "-daemonset-config", "defaultMode": 0o600},
            "name": project + "-config",
        } in daemonset["volumes"]
    
>       assert {
            "name": "data",
            "hostPath": {
                "path": "/var/lib/" + name + "-default-data",
                "type": "DirectoryOrCreate",
            },
        } in daemonset["volumes"]
E       AssertionError: assert {'hostPath': {'path': '/var/lib/release-name-metricbeat-default-data', 'type': 'DirectoryOrCreate'}, 'name': 'data'} in [{'configMap': {'defaultMode': 384, 'name': 'release-name-metricbeat-daemonset-config'}, 'name': 'metricbeat-config'},...kersock'}, {'hostPath': {'path': '/proc'}, 'name': 'proc'}, {'hostPath': {'path': '/sys/fs/cgroup'}, 'name': 'cgroup'}]

tests/metricbeat_test.py:110: AssertionError
_____________________________________________________________________________ test_setting_fullnameOverride _____________________________________________________________________________

    def test_setting_fullnameOverride():
        config = """
    fullnameOverride: 'metricbeat-custom'
    """
        r = helm_template(config)
    
        custom_name = "metricbeat-custom"
        assert custom_name in r["daemonset"]
        assert (
            r["daemonset"][custom_name]["spec"]["template"]["spec"]["containers"][0]["name"]
            == project
        )
        assert (
            r["daemonset"][custom_name]["spec"]["template"]["spec"]["serviceAccountName"]
            == name
        )
        volumes = r["daemonset"][custom_name]["spec"]["template"]["spec"]["volumes"]
>       assert {
            "name": "data",
            "hostPath": {
                "path": "/var/lib/" + custom_name + "-default-data",
                "type": "DirectoryOrCreate",
            },
        } in volumes
E       AssertionError: assert {'hostPath': {'path': '/var/lib/metricbeat-custom-default-data', 'type': 'DirectoryOrCreate'}, 'name': 'data'} in [{'configMap': {'defaultMode': 384, 'name': 'metricbeat-custom-daemonset-config'}, 'name': 'metricbeat-config'}, {'hos...kersock'}, {'hostPath': {'path': '/proc'}, 'name': 'proc'}, {'hostPath': {'path': '/sys/fs/cgroup'}, 'name': 'cgroup'}]

tests/metricbeat_test.py:1133: AssertionError
========================================================================== 2 failed, 30 passed in 4.17 seconds ==========================================================================
../helpers/common.mk:36: recipe for target 'pytest' failed
make: *** [pytest] Error 1
➜  metricbeat git:(master) 

@fatmcgav
Contributor

fatmcgav commented Jul 7, 2020

will the changes be backported to older versions? We're currently on 7.3.2.

Yes, the changes will be backported from master into the current release branches, which as of today are 7.x, 7.8 and 6.8.
So this fix would land in either the next 7.8 patch release or the 7.9 minor release.

I can't make the test pass.

Hmm, that's strange. Let me see if I can run them...

@fatmcgav
Contributor

fatmcgav commented Jul 7, 2020

@erihanse OK, I was able to reproduce the test failure you're seeing locally...

The volume gets mapped on this line:

path: {{ .Values.hostPathRoot }}/{{ template "metricbeat.fullname" . }}-{{ .Release.Namespace }}-data

The important thing there is .Release.Namespace. Unless a namespace is explicitly passed to helm, this will be whatever namespace is configured in your local kubectl context.
So if you're in a namespace other than default, this is likely to fail.

I've spotted a fix and I'll get a PR up shortly, but in the meantime if you switch to the default namespace the tests should pass...
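To illustrate (the "elastic" namespace below is just an example), the tests assert against the path rendered for the default namespace:

# hostPath rendered when the kubectl context namespace is "default" (what the tests expect)
path: /var/lib/release-name-metricbeat-default-data
# hostPath rendered when the context namespace is e.g. "elastic" — the assertion then fails
path: /var/lib/release-name-metricbeat-elastic-data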

@fatmcgav
Contributor

fatmcgav commented Jul 8, 2020

I've just merged #715 which includes a "fix" for the namespace issue.
