I have 10 GPUs on one server, and I want all of them used when serving a model. Here are some awkward workarounds I have tried; I am looking for a better approach.
When exporting the model, I assign 10 input placeholders and 10 outputs to GPUs 0-9, each with an identical graph, and register 10 entries in the signature_def_map.
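A minimal sketch of this kind of export, assuming TensorFlow 1.x and its SavedModel builder. The `build_tower` function, the signature names, and the export path are placeholders for illustration, not the original poster's code:

```python
# Sketch: replicate one inference graph across 10 GPUs and export one
# signature per GPU. Assumes TF 1.x; build_tower() is a hypothetical
# function that builds one copy of the graph and returns (input, output).
NUM_GPUS = 10

def signature_name(i):
    # One signature key per GPU, e.g. "predict_gpu_0" ... "predict_gpu_9".
    return "predict_gpu_%d" % i

def export_replicated_model(export_dir, build_tower):
    # Imported here so the helper above is usable without TF installed.
    import tensorflow as tf

    signature_def_map = {}
    with tf.Graph().as_default(), tf.Session() as sess:
        for i in range(NUM_GPUS):
            with tf.device("/gpu:%d" % i):
                inputs, outputs = build_tower(i)
            signature_def_map[signature_name(i)] = (
                tf.saved_model.signature_def_utils.predict_signature_def(
                    inputs={"input": inputs}, outputs={"output": outputs}))
        sess.run(tf.global_variables_initializer())
        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map=signature_def_map)
        builder.save()
```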
After that I use one model_server to serve it, and my client sends polling requests asynchronously with different request.model_spec.signature_name values.
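The client side of this scheme can be sketched as a round-robin cycle over the per-GPU signature names. The gRPC part assumes the tensorflow-serving-api package; the model name and signature names are assumptions and must match whatever keys were used in the exported signature_def_map:

```python
import itertools

NUM_GPUS = 10
# Signature names are an assumption; they must match the exported
# signature_def_map keys.
SIGNATURES = ["predict_gpu_%d" % i for i in range(NUM_GPUS)]
sig_cycle = itertools.cycle(SIGNATURES)

def next_signature():
    # Round-robin: each call returns the next per-GPU signature name.
    return next(sig_cycle)

def send_request(stub, data):
    # Imported here so next_signature() works without the serving API installed.
    from tensorflow_serving.apis import predict_pb2
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "my_model"         # assumed model name
    request.model_spec.signature_name = next_signature()
    # ... fill request.inputs from `data`, then issue an async gRPC call:
    return stub.Predict.future(request, 5.0)
```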
This method does use multiple GPUs, but the average efficiency of each GPU is obviously lower than when only one GPU is working.
Next I tried using ten model_server instances to serve one model, each configured to use a specific GPU. By targeting different ports from the client, I can also use all 10 GPUs at the same time. However, managing ten servers is more difficult than managing one.
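One way to tame the ten processes is a small launcher script that pins each server to one GPU via CUDA_VISIBLE_DEVICES. A sketch assuming the tensorflow_model_server binary is on PATH; the port range (9000-9009), model name, and base path are placeholders:

```shell
#!/bin/sh
# Print one tensorflow_model_server launch command per GPU, each pinned
# to its GPU via CUDA_VISIBLE_DEVICES. Ports, model name, and path are
# assumptions; adjust to your deployment.
launch_cmds() {
  i=0
  while [ "$i" -lt 10 ]; do
    echo "CUDA_VISIBLE_DEVICES=$i tensorflow_model_server" \
         "--port=$((9000 + i)) --model_name=my_model" \
         "--model_base_path=/models/my_model"
    i=$((i + 1))
  done
}

# To actually start all ten servers in the background:
#   launch_cmds | while read -r cmd; do sh -c "$cmd" & done
launch_cmds
```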
Has anyone with similar application requirements got some advice to share?
Another problem is load balancing, which I think depends heavily on how the multi-GPU setup is implemented. I haven't thought of a good way to do this yet.
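For the ten-servers-on-ten-ports setup, a common option is to put a TCP load balancer (e.g. HAProxy or nginx) in front of the ports. As a minimal client-side alternative, here is a least-outstanding-requests picker; the port numbers are illustrative and should match however the per-GPU servers were started:

```python
import heapq

class LeastLoadedPicker:
    """Pick the backend port with the fewest in-flight requests.

    A simple client-side load-balancing sketch, not a production balancer.
    """
    def __init__(self, ports):
        # Min-heap of (in_flight_count, port); counts start at zero.
        self._heap = [(0, p) for p in ports]
        heapq.heapify(self._heap)

    def acquire(self):
        # Take the least-loaded port and record one more in-flight request.
        count, port = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (count + 1, port))
        return port

    def release(self, port):
        # Mark one request on `port` as finished.
        for idx, (count, p) in enumerate(self._heap):
            if p == port:
                self._heap[idx] = (count - 1, p)
                heapq.heapify(self._heap)
                return
```

Each client call wraps a request in `acquire()`/`release()`; an idle GPU's port is always preferred over a busy one, which smooths load when request latencies vary.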