[Tiny] Dockerfile: use tini to reap zombie processes #1002

Merged


fflorent
Collaborator

@fflorent fflorent commented May 29, 2024

Context

On the doc workers of our production instance, we see hundreds of zombie processes. We would like them to be reaped.

Proposed solution

We have in mind one of these solutions:

  • use the --init option of the docker run command, or the shareProcessNamespace option in kubernetes;
  • use a tool like tini to reap zombie processes.

The latter option seems better because it benefits everyone running the image with docker, podman or kubernetes, without having to worry about which options to pass.

How to test?

Run the following command:

$ docker run --name grist -e GRIST_SANDBOX_FLAVOR=gvisor -ti -p 8484:8484 --rm gristlabs/grist

(please note that we don't add the --init option).

Open or create a document, then force-reload the data engine.

Now if you open a shell (docker exec -ti grist /bin/bash), you should see zombie processes (marked with STAT=Z or STAT=Zs):

root@eb81d9203e56:/grist# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1 10.6  0.9 1000880 154888 pts/0  Ssl+ 09:54   0:02 node _build/stubs/app/server/server.js
root         189  0.0  0.0      0     0 ?        Zs   09:54   0:00 [exe] <defunct>
root         229  0.3  0.0      0     0 ?        Z    09:54   0:00 [exe] <defunct>
root         237  0.0  0.0   3868  3212 pts/1    Ss   09:54   0:00 /bin/bash
root         268  0.0  0.0      0     0 ?        Zs   09:54   0:00 [exe] <defunct>
root         309  2.2  0.0      0     0 ?        Z    09:54   0:00 [exe] <defunct>
root         331  0.0  0.0  17040 13100 pts/0    S+   09:54   0:00 python3 sandbox/gvisor/run.py -E PYTHONPATH=/grist/sandbox/grist -E PIPE_MODE=minimal -m /grist/sandbox --restore /tmp/engine__grist python3 -- /
root         334  0.0  0.1 730088 17948 pts/0    Sl+  09:54   0:00 runsc -root /tmp/runsc -unprivileged -ignore-cgroups -network none restore --image-path=/tmp/engine__grist _tmp_tmpax_u641x
root         340  0.0  0.1 756296 19136 pts/0    Sl+  09:54   0:00 runsc-gofer --root=/tmp/runsc --network=none --unprivileged=true --ignore-cgroups=true gofer --bundle /tmp/tmpax_u641x --spec-fd=3 --mounts-fd=4 
root         341  0.0  1.0 3012440 168676 ?      Ssl  09:54   0:00 runsc-sandbox --root=/tmp/runsc --network=none --unprivileged=true --ignore-cgroups=true boot --bundle=/tmp/tmpax_u641x --total-memory 1653697740
root         356  0.0  0.0      4     4 ?        tsl  09:54   0:00 [exe]
root         396  0.0  0.1  47616 30912 ?        tl   09:54   0:00 [exe]
root         403  0.0  0.0   7640  2680 pts/1    R+   09:54   0:00 ps aux
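
To make the zombies easier to spot in a longer listing, the check above can be narrowed to processes whose STAT starts with Z. The following is a minimal sketch and not part of this PR; the helper name filter_zombies is hypothetical, and it assumes a procps-style ps and awk are available in the container:

```shell
# Hypothetical helper (not part of the PR): keep only zombie processes.
# Expects `ps -eo pid,ppid,stat,comm` style output on stdin, so it can
# also be run against saved output. A STAT value starting with "Z"
# marks a zombie; the header line never matches, so it is dropped.
filter_zombies() {
  awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}
```

Inside the container it could be used as `ps -eo pid,ppid,stat,comm | filter_zombies`, which prints the PID, parent PID and command name of each zombie. Empty output means none were found.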

Now if you build the docker image from this branch and run it the same way, you should not see zombie processes anymore:

root@4b9fe56f7592:/grist# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   2284   756 pts/0    Ss   09:57   0:00 /usr/bin/tini -s -- ./sandbox/run.sh
root           7 11.8  0.9 1002856 153744 pts/0  Sl+  09:57   0:02 node _build/stubs/app/server/server.js
root         329  0.1  0.0   3868  3216 pts/1    Ss   09:58   0:00 /bin/bash
root         341  3.0  0.0  17040 13124 pts/0    S+   09:58   0:00 python3 sandbox/gvisor/run.py -E PYTHONPATH=/grist/sandbox/grist -E PIPE_MODE=minimal -m /grist/sandbox --restore /tmp/engine__grist python3 -- /
root         344  0.0  0.1 730088 17432 pts/0    Sl+  09:58   0:00 runsc -root /tmp/runsc -unprivileged -ignore-cgroups -network none restore --image-path=/tmp/engine__grist _tmp_tmpbcv0s1vq
root         350  1.0  0.1 756296 20320 pts/0    Sl+  09:58   0:00 runsc-gofer --root=/tmp/runsc --network=none --unprivileged=true --ignore-cgroups=true gofer --bundle /tmp/tmpbcv0s1vq --spec-fd=3 --mounts-fd=4 
root         351 48.0  0.9 2944788 154724 ?      Ssl  09:58   0:00 runsc-sandbox --root=/tmp/runsc --network=none --unprivileged=true --ignore-cgroups=true boot --bundle=/tmp/tmpbcv0s1vq --total-memory 1653697740
root         366  0.0  0.0      4     4 ?        tsl  09:58   0:00 [exe]
root         405  4.0  0.1  44760 29572 ?        tl   09:58   0:00 [exe]
root         411  0.0  0.0   7640  2712 pts/1    R+   09:58   0:00 ps aux
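
A quick way to confirm that tini is now PID 1, as in the listing above, is to read /proc/1/cmdline. This is a hedged sketch rather than something from the PR; the helper name pid1_is_tini is hypothetical, and the optional file argument exists only so the function can be tried on a saved cmdline file:

```shell
# Hypothetical helper (not part of the PR): succeed if the command line
# stored in the given file mentions tini as a whole word.
# Defaults to /proc/1/cmdline, whose arguments are NUL-separated, so the
# NULs are turned into spaces before matching.
pid1_is_tini() {
  tr '\0' ' ' < "${1:-/proc/1/cmdline}" | grep -qw 'tini'
}
```

Run inside the container, `pid1_is_tini && echo "tini is PID 1"` should print the message with the new image and stay silent with the old one, where PID 1 is node.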

@fflorent fflorent changed the title Dockerfile: use tini to reap zombie processes [Tiny] Dockerfile: use tini to reap zombie processes May 29, 2024
@fflorent fflorent marked this pull request as ready for review May 29, 2024 07:20
@fflorent fflorent requested a review from paulfitz May 29, 2024 07:20
@paulfitz
Member

Hi @fflorent, thanks for pointing out these zombie processes. I'd like to recruit someone to investigate why they are present. I just double checked and we don't see this behavior in our SaaS under the same conditions. Our SaaS Dockerfile runs Grist in a way we haven't touched for years, maybe there is now some significant difference with grist-core? Don't know yet.

I'd rather not put a band-aid on this without understanding it. However, I get that this is a problem for you all, so I will try to be quick.

@jordigh jordigh self-assigned this May 29, 2024
@fflorent
Collaborator Author

fflorent commented Jun 4, 2024

@paulfitz @jordigh FWIW, we activated the k8s option shareProcessNamespace, so for our own needs this PR is not very urgent.

We opened this PR so that zombies are avoided when running the docker image with the default options.

@jordigh
Contributor

jordigh commented Jun 5, 2024

Zombie processes are technically something we are doing wrong, but this comment

"I'm sure you could solve that in code, but again: why write it when you can just drop Tini in?"

convinced me that the "right" solution isn't worth it. Merging this one in.

@jordigh jordigh merged commit a304b22 into gristlabs:main Jun 5, 2024
14 of 15 checks passed