I have 10 GPUs on one server, and I want all of them used when serving a model. Here are some awkward workarounds I have tried; I am looking for a better approach.
When exporting the model, I assign 10 input placeholders and 10 outputs to GPUs 0-9, each with an identical graph, and register 10 entries in the signature_def_map.
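A minimal sketch of this kind of export, assuming TensorFlow 1.x and its SavedModel builder. The `build_tower` function, the signature names, and the export path are placeholders for illustration, not the original poster's code:

```python
# Sketch: replicate one inference graph across 10 GPUs and export one
# signature per GPU. Assumes TF 1.x; build_tower() is a hypothetical
# function that builds one copy of the graph and returns (input, output).
NUM_GPUS = 10

def signature_name(i):
    # One signature key per GPU, e.g. "predict_gpu_0" ... "predict_gpu_9".
    return "predict_gpu_%d" % i

def export_replicated_model(export_dir, build_tower):
    # Imported here so the helper above is usable without TF installed.
    import tensorflow as tf

    signature_def_map = {}
    with tf.Graph().as_default(), tf.Session() as sess:
        for i in range(NUM_GPUS):
            with tf.device("/gpu:%d" % i):
                inputs, outputs = build_tower(i)
            signature_def_map[signature_name(i)] = (
                tf.saved_model.signature_def_utils.predict_signature_def(
                    inputs={"input": inputs}, outputs={"output": outputs}))
        sess.run(tf.global_variables_initializer())
        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map=signature_def_map)
        builder.save()
```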
After that I use one model_server to serve it, and my client sends polling requests asynchronously with different request.model_spec.signature_name values.
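The client side of this scheme can be sketched as a round-robin cycle over the per-GPU signature names. The gRPC part assumes the tensorflow-serving-api package; the model name and signature names are assumptions and must match whatever keys were used in the exported signature_def_map:

```python
import itertools

NUM_GPUS = 10
# Signature names are an assumption; they must match the exported
# signature_def_map keys.
SIGNATURES = ["predict_gpu_%d" % i for i in range(NUM_GPUS)]
sig_cycle = itertools.cycle(SIGNATURES)

def next_signature():
    # Round-robin: each call returns the next per-GPU signature name.
    return next(sig_cycle)

def send_request(stub, data):
    # Imported here so next_signature() works without the serving API installed.
    from tensorflow_serving.apis import predict_pb2
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "my_model"         # assumed model name
    request.model_spec.signature_name = next_signature()
    # ... fill request.inputs from `data`, then issue an async gRPC call:
    return stub.Predict.future(request, 5.0)
```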
This method does use multiple GPUs, but the average efficiency of each GPU is obviously lower than when only one GPU is working.
Next I tried using ten model_server instances to serve one model, each configured to use a specific GPU. By targeting different ports from the client, I can also use all 10 GPUs at the same time. However, managing ten servers is more difficult than managing one.
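One way to tame the ten processes is a small launcher script that pins each server to one GPU via CUDA_VISIBLE_DEVICES. A sketch assuming the tensorflow_model_server binary is on PATH; the port range (9000-9009), model name, and base path are placeholders:

```shell
#!/bin/sh
# Print one tensorflow_model_server launch command per GPU, each pinned
# to its GPU via CUDA_VISIBLE_DEVICES. Ports, model name, and path are
# assumptions; adjust to your deployment.
launch_cmds() {
  i=0
  while [ "$i" -lt 10 ]; do
    echo "CUDA_VISIBLE_DEVICES=$i tensorflow_model_server" \
         "--port=$((9000 + i)) --model_name=my_model" \
         "--model_base_path=/models/my_model"
    i=$((i + 1))
  done
}

# To actually start all ten servers in the background:
#   launch_cmds | while read -r cmd; do sh -c "$cmd" & done
launch_cmds
```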
Has anyone with similar application requirements got some advice to share?
Another problem is load balancing, which I think depends heavily on how the multi-GPU setup is implemented. I haven't thought of a good way to do this yet.
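For the ten-servers-on-ten-ports setup, a common option is to put a TCP load balancer (e.g. HAProxy or nginx) in front of the ports. As a minimal client-side alternative, here is a least-outstanding-requests picker; the port numbers are illustrative and should match however the per-GPU servers were started:

```python
import heapq

class LeastLoadedPicker:
    """Pick the backend port with the fewest in-flight requests.

    A simple client-side load-balancing sketch, not a production balancer.
    """
    def __init__(self, ports):
        # Min-heap of (in_flight_count, port); counts start at zero.
        self._heap = [(0, p) for p in ports]
        heapq.heapify(self._heap)

    def acquire(self):
        # Take the least-loaded port and record one more in-flight request.
        count, port = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (count + 1, port))
        return port

    def release(self, port):
        # Mark one request on `port` as finished.
        for idx, (count, p) in enumerate(self._heap):
            if p == port:
                self._heap[idx] = (count - 1, p)
                heapq.heapify(self._heap)
                return
```

Each client call wraps a request in `acquire()`/`release()`; an idle GPU's port is always preferred over a busy one, which smooths load when request latencies vary.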