[GH Issue Summarization] Create a model server #11
Comments
/assign @ankushagarwal
@ankushagarwal, can you describe the problems you ran into converting the model into one that can be served with TF Serving? Would it be easier to serve the model using Seldon?
The model used for issue summarization is very different from the examples we've been using. For our image models, prediction is a single call: the serialized image goes in and the class scores come out. But for the issue summarization model, prediction is a sequence-to-sequence loop: the encoder embeds the issue body, and the decoder must be called repeatedly, feeding each predicted token back in until an end-of-sequence token is produced.
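A minimal sketch of the two prediction shapes, assuming a standard Keras encoder/decoder split (`encoder_model`, `decoder_model`, and the token ids are illustrative names, not the actual example code):

```python
import numpy as np

# Image models: prediction is a single call.
#   scores = model.predict(image_batch)

# Seq2seq issue summarization: prediction is a decoding loop.
def summarize(encoder_model, decoder_model, body_tokens, start_id, end_id, max_len=12):
    state = encoder_model.predict(np.array([body_tokens]))  # encode the issue body once
    token = np.array([[start_id]])
    title_ids = []
    for _ in range(max_len):
        probs, state = decoder_model.predict([token, state])  # one decoder step
        next_id = int(np.argmax(probs[0, -1]))
        if next_id == end_id:  # stop at the end-of-sequence token
            break
        title_ids.append(next_id)
        token = np.array([[next_id]])  # feed the prediction back in
    return title_ids
```

This is why a stateless single-call serving API is an awkward fit: either the server must run the loop, or the client must hold decoder state between calls.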
The first issue I had was exporting Keras models as TensorFlow models that can be used by TF Serving; this is mostly done. The second challenge is understanding how TF Serving works with a model like this, where prediction is a decoding loop rather than a single call.
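For the export step, a minimal sketch of the Keras-to-SavedModel conversion, assuming the TF 1.x `SavedModelBuilder` API that TF Serving loads (the model path, export directory, and signature names are illustrative):

```python
import tensorflow as tf
from keras import backend as K
from keras.models import load_model

model = load_model("seq2seq_model.h5")  # hypothetical path to the trained model

builder = tf.saved_model.builder.SavedModelBuilder("export/1")
signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={"input": model.input}, outputs={"output": model.output})
builder.add_meta_graph_and_variables(
    sess=K.get_session(),  # the session holding the Keras model's variables
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature,
    })
builder.save()
```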
I am having issues with the model exported from Keras and imported into TF Serving: I get an error when I send a prediction request to the TF Serving server.
I could not find a workaround for this, so I will give Seldon or Tornado a shot for serving this Keras model. We can probably illustrate serving a model with TF Serving in another example that trains a TensorFlow model directly.
Create a simple tornado server to serve the model
TODO: Create a docker image for the server and deploy on kubeflow
Related to #11
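As a rough illustration of that Tornado stub, a minimal sketch assuming hypothetical `load_model()` and `summarize()` helpers for the exported seq2seq model (neither name comes from the actual code):

```python
import json

import tornado.ioloop
import tornado.web


class PredictHandler(tornado.web.RequestHandler):
    def initialize(self, model):
        self.model = model

    def post(self):
        # Expect a JSON body such as {"issue_body": "..."}.
        payload = json.loads(self.request.body)
        title = summarize(self.model, payload["issue_body"])  # hypothetical helper
        self.write(json.dumps({"summary": title}))


def main():
    model = load_model()  # hypothetical loader for the exported Keras model
    app = tornado.web.Application([(r"/predict", PredictHandler, {"model": model})])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()


if __name__ == "__main__":
    main()
```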
@cliveseldon @gsunner Do you think we should try to use Seldon here? Could we use the existing Seldon model server rather than creating our own Tornado stub? Should we deploy the model using Seldon Core rather than deploying it directly with K8s resources?
@jlewi For a sklearn model, seldon-core would seem to be a good choice.

@ankushagarwal You describe a seq-to-seq model, but does the external business app send the whole sequence of characters in a single request and get a sequence back? If so, that should fit fine into the seldon-core prediction payload using NDArray. Your prediction component would need to split the request and then do as you specify in the pseudo-code above. I suggest you look at https://github.com/kubeflow/example-seldon, which contains a sklearn model in the example code.

@jlewi I'm not sure I follow your last two questions. It would seem preferable to use the most appropriate existing serving solution, TF Serving or seldon-core, rather than starting to build a new one.
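For concreteness, a minimal sketch of such a prediction component, assuming the seldon-core Python wrapper contract of a model class exposing a `predict` method over the features array (the class name and the `load_model`/`summarize` helpers are illustrative):

```python
class IssueSummarization(object):
    def __init__(self):
        # Load the exported Keras seq2seq model once at startup
        # (hypothetical loader, not the actual example code).
        self.model = load_model()

    def predict(self, X, feature_names):
        # X is the NDArray payload: here, a batch of issue-body strings.
        # Split the batch and run the seq2seq decoding loop row by row.
        return [[summarize(self.model, body)] for body in X]
```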
Hi @cliveseldon, I have followed the instructions at https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md and wrapped my model into a docker image. I am able to run the image locally, and it serves a REST API on port 5000. My question is: what is the API for sending a prediction request to that server? I could not find docs on it.
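Per the seldon-core wrapper docs of that era, the wrapped server exposes a `POST /predict` endpoint that takes the prediction message as a form-encoded `json` field; a hedged sketch using Python `requests` (the payload contents are illustrative):

```python
import json

import requests

payload = {"data": {"ndarray": [["the issue body text to summarize"]]}}
resp = requests.post(
    "http://localhost:5000/predict",
    data={"json": json.dumps(payload)},  # form-encoded, per the wrapper docs
)
print(resp.json())  # response mirrors the payload shape, e.g. {"data": {"ndarray": [...]}}
```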
Update the issue summarization end-to-end tutorial to deploy the seldon core model to the k8s cluster
Update the sample request and response
Related to #11
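For the sample request against the deployed model, a hedged sketch assuming the seldon-core gateway's `/api/v0.1/predictions` endpoint is reachable at `localhost:8080` (e.g. via port-forward; host, port, and payload are illustrative):

```python
import requests

payload = {"data": {"ndarray": [["the issue body text to summarize"]]}}
resp = requests.post("http://localhost:8080/api/v0.1/predictions", json=payload)
print(resp.json())  # e.g. {"data": {"ndarray": [["generated issue title"]]}}
```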
Closing since we now have a Seldon model server.
Create a model server using TFServing.
Component of #14.