uwsgi master graceful shutdown #1974

Closed
itielshwartz opened this issue Jan 31, 2019 · 12 comments
Comments


itielshwartz commented Jan 31, 2019

I'm running a uwsgi + Flask application. The app runs as a k8s pod.

When I deploy a new pod (a new version), the existing pod gets SIGTERM.

This causes the master to stop accepting new connections at that very moment, which causes issues because the LB still passes requests to the pod for a few more seconds.

I would like the master to wait 30 seconds BEFORE it stops accepting new connections (when it gets SIGTERM), but I couldn't find a way to do that. Is it possible?

My uwsgi.ini file:

;https://uwsgi-docs.readthedocs.io/en/latest/HTTP.html
http = :8080
wsgi-file = main.py
callable = wsgi_application
processes = 2
enable-threads = true
master = true
reload-mercy = 30
worker-reload-mercy = 30
log-5xx = true
log-4xx = true
disable-logging = true
stats = 127.0.0.1:1717
stats-http = true
single-interpreter = true
;https://github.com/containous/traefik/issues/615
http-keepalive = true
add-header = Connection: Keep-Alive

I also asked on Stack Overflow: https://stackoverflow.com/questions/54459949/uwsgi-master-graceful-shutdown, but no luck so far.

xrmx (Collaborator) commented Jan 31, 2019

Shouldn't you kill the old version only after you deploy the new pod?

itielshwartz (Author) commented Jan 31, 2019

It's part of the k8s rolling update, plus the time it takes to remove the Pod from the endpoint list; I can't control it.

xrmx (Collaborator) commented Jan 31, 2019

@itielshwartz a quick Google search revealed there's readinessProbe to avoid your issue.

itielshwartz (Author) commented Jan 31, 2019

Hi @xrmx, sadly that doesn't cover this case: the readiness probe doesn't run every second, so between the moment k8s sends SIGTERM and the moment the probe notices, there is a window in which uwsgi isn't serving.

More than that, after SIGTERM the pod shouldn't get traffic, but it still does, because the iptables rules take time to update: https://hackernoon.com/graceful-shutdown-in-kubernetes-435b98794461.

So like I said, this feature is indeed needed...

xrmx (Collaborator) commented Jan 31, 2019

The readiness probe on the new pod should prevent the old one from being killed before the new one is up; on the old pod it should make k8s stop routing traffic to it. If that does not work, you should probably fix your cluster configuration, no? Adding an arbitrary delay to process teardown seems like a hack.
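
For reference, a readiness probe of the kind suggested here is declared on the container in the Deployment's pod template. A minimal sketch, assuming the Flask app exposes a /health endpoint on the same port 8080 that uwsgi binds; the Deployment name, image, path, and timing values are placeholders:

# Sketch of a Deployment with a readinessProbe on the uwsgi container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-uwsgi
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask-uwsgi
  template:
    metadata:
      labels:
        app: flask-uwsgi
    spec:
      containers:
        - name: app
          image: flask-uwsgi:latest        # placeholder image
          ports:
            - containerPort: 8080          # matches "http = :8080" in uwsgi.ini
          readinessProbe:
            httpGet:
              path: /health                # assumed health endpoint
              port: 8080
            periodSeconds: 5
            failureThreshold: 3

During a rolling update this keeps the old pods serving until a new pod reports ready; it does not, on its own, close the post-SIGTERM window discussed below.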

itielshwartz (Author) commented Jan 31, 2019

Hi @xrmx, did you read the link I sent, https://hackernoon.com/graceful-shutdown-in-kubernetes-435b98794461?
After the old pod gets SIGTERM (because it is being replaced and a new pod is already up), the readiness probe no longer matters: the old pod shouldn't get any traffic, yet it still might (and does) for a few seconds. It is a known issue, not my cluster and not my configuration.

It is (maybe) a hack, but not one that is specific to my setup; it matters to anyone who wants 100% uptime when rolling out a new version with k8s.

xrmx (Collaborator) commented Jan 31, 2019

@itielshwartz That's a post from 2017; it does not feel that authoritative, sorry. Again, sending SIGTERM while still sending traffic does not look right to me. But hey, I'm not here to convince you.

itielshwartz (Author) commented Jan 31, 2019

@xrmx I totally agree with:

sending SIGTERM while still sending traffic does not look right to me

But as this is the k8s default, I don't think there's any choice.

I just wanted to know whether uwsgi can provide a solution; one of the reasons I opened this issue is that (I guess) other people will face the same problem...

If uwsgi has no solution, I will use the k8s preStop hook (see the sketch after this comment). It feels more hacky to me (as I can't test it locally, and it's very k8s-specific), but it will solve the issue.

About the post: it is indeed from 2017, but as the first commenter stated:

The behavior you’re seeing is caused by lag in kube-proxy, and it’s apparently by design.

Kubernetes cannot, by definition, maintain perfect consistency in a distributed scenario where network partitions are common. Kubernetes chooses to be eventually consistent here, in order to be available and partition-tolerant.

The usual solution is to add a “preStop” hook to the pod that sleeps for, say, 3 seconds. This allows the pod to wait until it’s properly deregistered. It feels hacky, sure, but it works. Of course, this doesn’t solve what to do with persistent connections.
Long discussion in this Github issue.

Also, here's the ongoing issue he mentioned:

kubernetes-retired/contrib#1140 (comment)

Just ran into this issue - at the very least this should be better documented, I'm sure the vast majority of people using rolling updates believe that kubernetes waits until a pod is out of rotation before sending SIGTERM, and this is not the case!

I also think that even if not implemented right away, sequencing SIGTERM to happen after a pod has been taken out of rotation should be the eventual goal: I don't think saying "gunicorn is doing the wrong thing" is a useful response, that's just kubernetes redefining what it thinks these signals mean, gunicorn and other servers behave in a fairly standard way wrt signals, and kubernetes should aim to be compatible with them rather than offloading the problem onto users.

Anyway, thanks for trying to help (and for uwsgitop!)
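
For concreteness, the "preStop hook that sleeps" from the quote above would look roughly like this when added to the pod template of the Deployment sketched earlier. A minimal sketch: the 30-second sleep mirrors the delay asked for in the original report, the container name and image are placeholders, and terminationGracePeriodSeconds has to cover the sleep plus uwsgi's own shutdown:

# kubelet runs the preStop hook to completion before it sends SIGTERM,
# so uwsgi keeps accepting requests while kube-proxy and the LB drop
# the pod's endpoint.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60      # must exceed sleep + worker drain
      containers:
        - name: app                          # placeholder container name
          image: flask-uwsgi:latest          # placeholder image
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 30"]

The sleep needs /bin/sh (or a sleep binary) inside the container image.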

xrmx (Collaborator) commented Jan 31, 2019

BTW, if you are on uwsgi 2.0 you should add die-on-term so that uwsgi exits, rather than reloads, on SIGTERM.

itielshwartz (Author) commented:

Thanks, I'm using it :)
I tried removing it while experimenting.

Anyway, the issue is resolved (I will use hooks), thanks.

amramtamir commented:

Hi @itielshwartz, can you share your solution with the hooks?
Thanks!
