These are some parameters we currently use in our internal model serving platform that would be very beneficial for Seldon.
http://docs.gunicorn.org/en/stable/settings.html#preload-app `--preload` allows you to initialize the model once and share it between all your gunicorn workers. This can save a lot of memory if your model is particularly large (or if you need to load lots of data as part of init).
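For reference, a minimal sketch of what enabling this could look like with a gunicorn config file (`preload_app` and `workers` are real gunicorn settings; the worker count and app module below are only placeholders):

```python
# gunicorn.conf.py -- sketch only
preload_app = True  # load the app (and therefore the model) once in the master,
                    # then fork workers that share the already-loaded memory
workers = 4         # example worker count
```

Started with something like `gunicorn -c gunicorn.conf.py <module>:<app>`, where `<module>:<app>` stands in for whatever WSGI app the wrapper exposes.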
If the model is loaded in `__init__`, it serves the same purpose as preload.
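As an illustration only (not the actual wrapper code), a model class that does its heavy loading in `__init__` pays the cost once per process, and with `--preload` only once in the master before forking. The joblib usage and file path are placeholder assumptions:

```python
import joblib  # placeholder: assumes a joblib-serialised model


class MyModel:
    def __init__(self):
        # Heavy work happens here. With --preload this runs in the
        # gunicorn master, so forked workers share the loaded model
        # pages via copy-on-write instead of each loading its own copy.
        self.model = joblib.load("/mnt/models/model.joblib")  # placeholder path

    def predict(self, X, features_names=None):
        return self.model.predict(X)
```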
http://docs.gunicorn.org/en/stable/settings.html#max-requests `--max-requests` and `--max-requests-jitter` will restart each worker after N + random(M) requests. gunicorn/flask seems to have some sort of issue with memory increasing as requests are processed. We could not figure out where the memory leak was happening, so we opted to use this instead.
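In a gunicorn config file this maps to two settings (the values below are only examples):

```python
# gunicorn.conf.py -- example values only
max_requests = 1000        # N: recycle a worker after this many requests
max_requests_jitter = 100  # M: add a random offset so workers don't all restart at once
```

The periodic recycle bounds the memory growth even when the underlying leak can't be tracked down.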
kparaju changed the title from "Allow --preload, --max-requests and --max-requests jitter parameters for python wrapper" to "Allow --max-requests and --max-requests jitter parameters for python wrapper" on Oct 4, 2019
I think the issue is that if the model is a TensorFlow model then it's not so easy to share the graph between running Gunicorn workers, so the current code calls load in the thread of the worker.
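To make the distinction concrete, here is a hedged sketch of the per-worker approach (deferring the heavy load until the worker process is running rather than relying on `--preload`). The `load()` method and path are illustrative, not the wrapper's actual API:

```python
class TensorflowModel:
    def __init__(self):
        # Keep __init__ cheap; no TensorFlow objects are created before fork.
        self._model = None

    def load(self):
        # Intended to run inside the worker process, so each worker
        # owns its own TensorFlow graph/session rather than sharing one.
        import tensorflow as tf  # imported late to avoid touching TF pre-fork
        self._model = tf.keras.models.load_model("/mnt/models/tf_model")  # placeholder path

    def predict(self, X, features_names=None):
        if self._model is None:
            self.load()  # lazy fallback if load() was not called explicitly
        return self._model.predict(X)
```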
On Tue, 8 Oct 2019, 17:52 Kshitij Parajuli wrote:
FYI, I'll put a PR out for this today.
kparaju added a commit to kparaju/seldon-core that referenced this issue on Oct 8, 2019