health_status events are too noisy/redundant #24003

Open
harish2704 opened this issue Sep 18, 2024 · 0 comments · May be fixed by #24005
Labels
kind/bug Categorizes issue or PR as related to a bug.

harish2704 commented Sep 18, 2024

Issue Description

  1. The health_status events emitted by Podman are too noisy.
    • A health_status event is emitted for every health check attempt. It should only be emitted when the health status of a container actually changes.
  2. Docker handles this case better: it emits separate events (exec_create and exec_die) for each health check attempt and only emits a health_status event when there is an actual status change.
    • That way, applications which implement Docker/Podman service discovery can safely and easily rely on these events.
    • For example, tools like the Traefik proxy implement Docker/Podman service discovery by reloading their configuration on every health_status event.
    • Because of this noisy behaviour, running tools like Traefik alongside Podman wastes considerable CPU.
    • I am not raising this report merely because Podman does not mirror Docker's behaviour; I simply believe Docker handles this case better than Podman.

As a secure container runtime, Podman has a great opportunity, and bugs like this will keep people from switching to it.
I found this issue while using Coolify together with Podman, where considerable CPU is wasted because of this behaviour.
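To make the expected behaviour concrete, here is a minimal sketch in Go of the "emit only on change" logic I am proposing. The Container type and the recordHealthCheck/emitHealthEvent names are purely illustrative assumptions, not Podman's actual internals:

    // Hypothetical sketch of the proposed behaviour; the type and function
    // names are illustrative and do not reflect Podman's real code.
    package main

    import "fmt"

    // Container remembers the last health status we reported an event for.
    type Container struct {
        Name             string
        lastHealthStatus string
    }

    // emitHealthEvent stands in for Podman's event writer.
    func emitHealthEvent(name, status string) {
        fmt.Printf("event: %s health_status %s\n", name, status)
    }

    // recordHealthCheck runs after every health check attempt, but only
    // emits a health_status event when the status actually changed.
    func (c *Container) recordHealthCheck(newStatus string) {
        changed := newStatus != c.lastHealthStatus
        c.lastHealthStatus = newStatus
        if changed {
            emitHealthEvent(c.Name, newStatus)
        }
    }

    func main() {
        c := &Container{Name: "for-podman_site0_1"}
        for _, s := range []string{"healthy", "healthy", "healthy", "unhealthy", "unhealthy"} {
            c.recordHealthCheck(s) // only two events are emitted: healthy, then unhealthy
        }
    }

With this behaviour, the five health check attempts in main produce only two events instead of five.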

Steps to reproduce the issue

  1. Clone https://github.com/harish2704/podman-health-check-bug
  2. Run cd for-podman
  3. Run podman-compose up -d
  4. Watch the events emitted by Podman by running podman events --format json | jq -r '[.time, .Name, .Status, .health_status] | @tsv'

Describe the results you received

The output I am seeing is given below.

1726677092      for-podman_site1_1      health_status   healthy
1726677092      for-podman_site0_1      health_status   healthy
1726677093      for-podman_site2_1      health_status   healthy
1726677093      for-podman_site1_1      health_status   healthy
1726677093      for-podman_site0_1      health_status   healthy
1726677094      for-podman_site2_1      health_status   healthy
1726677094      for-podman_site1_1      health_status   healthy
1726677094      for-podman_site0_1      health_status   healthy
1726677096      for-podman_site1_1      health_status   healthy
1726677096      for-podman_site0_1      health_status   healthy
1726677096      for-podman_site2_1      health_status   healthy
1726677098      for-podman_site2_1      health_status   healthy
1726677098      for-podman_site0_1      health_status   healthy
1726677098      for-podman_site1_1      health_status   healthy
1726677100      for-podman_site2_1      health_status   healthy
1726677102      for-podman_site0_1      health_status   healthy
1726677102      for-podman_site1_1      health_status   healthy
1726677103      for-podman_site2_1      health_status   healthy
1726677103      for-podman_site0_1      health_status   healthy
1726677103      for-podman_site1_1      health_status   healthy
1726677105      for-podman_site2_1      health_status   unhealthy
1726677105      for-podman_site1_1      health_status   unhealthy
1726677105      for-podman_site0_1      health_status   unhealthy
1726677106      for-podman_site1_1      health_status   unhealthy
1726677106      for-podman_site2_1      health_status   unhealthy
1726677107      for-podman_site0_1      health_status   unhealthy
1726677108      for-podman_site0_1      health_status   unhealthy
1726677108      for-podman_site1_1      health_status   unhealthy
1726677108      for-podman_site2_1      health_status   unhealthy
1726677110      for-podman_site2_1      health_status   unhealthy

In the output above, most of the health_status events are duplicates: they carry no new information.

Describe the results you expected

I expect a health_status event to be emitted only when there is an actual status change.
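As a stopgap on the consumer side, a small filter can collapse repeated health_status events per container before they reach a tool like Traefik. The sketch below assumes the JSON field names match the jq query above (time, Name, Status, health_status) and that time is a Unix timestamp, as in the output shown; it reads the event stream from stdin:

    // Workaround sketch (not part of Podman): pipe `podman events --format json`
    // into this program to drop repeated health_status events per container.
    // Assumption: field names mirror the jq query above; time is a Unix timestamp.
    package main

    import (
        "bufio"
        "encoding/json"
        "fmt"
        "os"
    )

    type event struct {
        Time         int64  `json:"time"`
        Name         string `json:"Name"`
        Status       string `json:"Status"`
        HealthStatus string `json:"health_status"`
    }

    func main() {
        last := map[string]string{} // container name -> last seen health_status
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            var ev event
            if err := json.Unmarshal(scanner.Bytes(), &ev); err != nil {
                continue // skip lines that are not valid JSON events
            }
            if ev.Status == "health_status" && last[ev.Name] == ev.HealthStatus {
                continue // same status as before: redundant, drop it
            }
            last[ev.Name] = ev.HealthStatus
            fmt.Printf("%d\t%s\t%s\t%s\n", ev.Time, ev.Name, ev.Status, ev.HealthStatus)
        }
    }

It can be run as podman events --format json | go run filter.go (filter.go being whatever file the sketch is saved as), but the real fix belongs in Podman itself.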

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-2.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 96.27
    systemPercent: 1.64
    userPercent: 2.09
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: kde
    version: "40"
  eventLogger: journald
  freeLocks: 2012
  hostname: fedora-desk
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.10.9-200.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 337436672
  memTotal: 16680095744
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-2.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240906.g6b38f07-1.fc40.x86_64
    version: |
      pasta 0^20240906.g6b38f07-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 7665238016
  swapTotal: 8589930496
  uptime: 28h 59m 43.00s (Approximately 1.17 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/harish/.config/containers/storage.conf
  containerStore:
    number: 20
    paused: 0
    running: 4
    stopped: 16
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/harish/.local/share/containers/storage
  graphRootAllocated: 236221104128
  graphRootUsed: 91409645568
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 50
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/harish/.local/share/containers/storage/volumes
version:
  APIVersion: 5.3.0-dev
  Built: 1726676363
  BuiltTime: Wed Sep 18 21:49:23 2024
  GitCommit: 62c101651ff85daa0370d5031b9e2b3b4c5f16be
  GoVersion: go1.22.6
  Os: linux
  OsArch: linux/amd64
  Version: 5.3.0-dev

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

My OS

NAME="Fedora Linux"
VERSION="40 (KDE Plasma)"

Additional information

No response

@harish2704 harish2704 added the kind/bug Categorizes issue or PR as related to a bug. label Sep 18, 2024
harish2704 added a commit to harish2704/podman that referenced this issue Sep 20, 2024

Emit event only if there is a change in health_status
Fixes containers#24003

Resolves containers#24005 (comment)

Pass additional isChanged flag to event creation function

Fix health check events for docker api

Signed-off-by: Harish Karumuthil <harish2704@gmail.com>