Optimize gunicorn settings running with docker #30

Open
TimMcCauley opened this issue Apr 14, 2018 · 19 comments

@TimMcCauley
Contributor

Sporadically the gunicorn workers time out - this may be due to the worker class settings: http://docs.gunicorn.org/en/stable/settings.html

@TimMcCauley TimMcCauley self-assigned this Apr 14, 2018
TimMcCauley added a commit that referenced this issue Aug 7, 2018
Replacing sync with gevent, closes #30
@TimMcCauley
Contributor Author

https://pythonspeed.com/articles/gunicorn-in-docker/

@TimMcCauley TimMcCauley reopened this May 22, 2019
@TimMcCauley TimMcCauley changed the title Understand worker_class of gunicorn Optimize gunicorn settings running with docker May 22, 2019
@TimMcCauley
Contributor Author

@zephylac do you have any experience with gunicorn settings? Sometimes requests are timing out on our live servers using the following settings:

workers = 2
worker_class = 'gevent'
worker_connections = 1000
timeout = 30
keepalive = 2

I am now trying the following settings instead, which are recommended in the post above.

worker_class = 'gthread'
threads = 4
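
For reference, a rough sketch of the full config file I have in mind - the file name, worker count formula and bind address are placeholders, not necessarily what runs on our servers:

```python
# gunicorn.conf.py -- sketch only; values are placeholders
import multiprocessing

# gthread workers: a handful of processes, a few threads each
workers = multiprocessing.cpu_count() * 2 + 1  # common rule of thumb
worker_class = 'gthread'
threads = 4

# keep the request timeout explicit so slow requests surface in the logs
timeout = 30
keepalive = 2

bind = '0.0.0.0:5000'
```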

@zephylac
Contributor

zephylac commented May 22, 2019

I don't have any experience with gunicorn, but I can try to look into it and find some info.

I'm currently spamming my instance with requests, but I haven't experienced any timeouts (for now).

@zephylac
Contributor

I've looked into it a little bit.
In the article you mentioned they were also talking about --worker-tmp-dir, which might cause problems for the workers.
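
If I read the article correctly, their suggested workaround is to move the worker heartbeat directory onto a tmpfs so a slow or blocking disk can't stall the workers - in config-file form that would be roughly (sketch only):

```python
# gunicorn.conf.py fragment -- sketch; the pythonspeed article suggests
# pointing the worker heartbeat directory at a tmpfs such as /dev/shm
worker_tmp_dir = '/dev/shm'
```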

I've already seen some info about the threads option. Opinions seemed to converge on threads = workers.
It seems that the [solution](https://www.brianstorti.com/the-role-of-a-reverse-proxy-to-protect-your-application-against-slow-clients/) some found was to put NGINX in front of gunicorn as a reverse proxy.

On my side, I've tried to make my workers time out (without changing the current gunicorn parameters). Under both extreme load and at rest, my workers don't seem to time out.

@TimMcCauley
Contributor Author

Thanks for looking this up @zephylac - if you are running your batch requests, could you also run them against api.openrouteservice.org at the same time? I can send you a token allowing a higher quota - if you agree, which email address could I send the token to?
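
Something as simple as the sketch below would already be helpful - the endpoint path, request body and auth header are placeholders, adjust them to whatever your batch script already does:

```python
# load_test.py -- sketch only; URL, body and auth header are placeholders
import concurrent.futures
import requests

URL = "https://api.openrouteservice.org/pois"  # placeholder endpoint
HEADERS = {"Authorization": "<token from the email>"}
BODY = {"request": "pois"}                     # placeholder request body

def fire(_):
    try:
        r = requests.post(URL, json=BODY, headers=HEADERS, timeout=35)
        return r.status_code, r.elapsed.total_seconds()
    except requests.Timeout:
        return "timeout", None

# fire 500 requests with 20 concurrent workers and print status + latency
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    for status, seconds in pool.map(fire, range(500)):
        print(status, seconds)
```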

@zephylac
Contributor

I've sent you an email!

@zephylac
Contributor

Under which architecture are you running your service? Are you using Docker? Are you running on a VM or on dedicated hardware?

@TimMcCauley
Contributor Author

We are running this on a VM in our OpenStack environment with 32 GB RAM and 8 cores. The PostGIS database is running on a different and smaller VM, unfortunately with very slow disks (which will soon be upgraded to SSDs). The containers running on this VM are:

ubuntu@ors-microservices:~|⇒  sudo docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                      NAMES
68404976f9d6        openelevationservice_gunicorn_flask_2      "/oes_venv/bin/gun..."   8 weeks ago         Up 2 days           0.0.0.0:5021->5000/tcp     openelevationservice_gunicorn_flask_2_1
6959766a7ee9        openelevationservice_gunicorn_flask        "/oes_venv/bin/gun..."   8 weeks ago         Up 2 days           0.0.0.0:5020->5000/tcp     openelevationservice_gunicorn_flask_1
ec736d4cd30c        openpoiservice_gunicorn_flask_05122018_2   "/ops_venv/bin/gun..."   5 months ago        Up 24 hours         0.0.0.0:5006->5000/tcp     openpoiservice_gunicorn_flask_05122018_2_1
c62417a4f60e        openpoiservice_gunicorn_flask_05122018     "/ops_venv/bin/gun..."   5 months ago        Up 24 hours         0.0.0.0:5005->5000/tcp     openpoiservice_gunicorn_flask_05122018_1

@zephylac
Contributor

zephylac commented May 23, 2019

Are the workers timing out even when idle? Or just under load?

I've looked at my logs; none of my workers have timed out during one week of intense load.

@TimMcCauley
Contributor Author

Some requests will simply time out, but I haven't found a pattern for this yet.

@zephylac
Contributor

zephylac commented Jun 6, 2019

Maybe PostgreSQL 12 & PostGIS 3 will fix part of this issue by properly supporting parallelization.

@TimMcCauley
Contributor Author

Agreed. Did you test the live API with the token I sent you by any chance @zephylac ?

@zephylac
Contributor

zephylac commented Jun 9, 2019

Yup, I tried, but it seems it has expired.

@TimMcCauley
Contributor Author

Ah shit, sorry - it's now extended forever ;-) and won't expire anymore (same token as in the email).

@boind12

boind12 commented Nov 10, 2020

Hi @TimMcCauley,
obviously it has been a while, but as I am facing the same issue you described (random timeouts with larger batches of POI requests using Docker), I am wondering if you have found a solution?

@lingster

Maybe this article helps? https://pythonspeed.com/articles/gunicorn-in-docker/

@boind12

boind12 commented Nov 15, 2020

Hi @lingster,
this link was mentioned earlier by Tim. I was unable to solve the problem using it.

@TimMcCauley
Contributor Author

Sorry for joining the party so late.

@boind12 could you run ANALYZE in the ops schema once and check again? What kind of requests are you running, and are you able to run the same query directly in SQL to see how it behaves (you can print the SQL query and fill in the placeholders manually)? How much memory are you giving Docker, and have you played around with pgtune settings? In a nutshell: it's most likely a Postgres issue.
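
For the ANALYZE part, something along these lines should do - a sketch assuming psycopg2; take the connection parameters from your openpoiservice configuration:

```python
# analyze_ops.py -- sketch only; connection parameters are placeholders
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="gis",
                        user="gis_admin", password="...")
cur = conn.cursor()

# refresh planner statistics for every table in the ops schema
cur.execute("SELECT tablename FROM pg_tables WHERE schemaname = 'ops'")
for (table,) in cur.fetchall():
    cur.execute(f'ANALYZE ops."{table}";')
    print(f"analyzed ops.{table}")

conn.commit()
cur.close()
conn.close()
```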

@boind12

boind12 commented Nov 17, 2020

Hi @TimMcCauley,
thanks for your support!
I am using the following setup:

  • Host: 16 GB RAM, 2 vCPU with 50 GB SSD (Google Cloud e2-highmem)
  • The host is running:
    • 1x Openrouteservice: https://github.com/GIScience/openrouteservice
    • 1x Openpoiservice
    • 1x postgis: https://hub.docker.com/r/kartoza/postgis/

I am running large batch requests for POIs with >50 km² areas, hence I assume some of them take longer than the 30s timeout of the gunicorn server from openpoiservice. By increasing the gunicorn timeout to 60s (see the sketch below) I was able to solve the issue.
However, I have now migrated the PostGIS database from the VM to a dedicated Google PostgreSQL instance. Maybe this helps further.
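
In case someone else hits this, the change itself is tiny - roughly this in the gunicorn config (sketch; the exact file depends on how you run openpoiservice):

```python
# gunicorn config fragment -- sketch of the change
timeout = 60  # was 30; large >50 km² POI batch requests need longer
```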
