Move from a custom tornado server to a seldon-core server for serving the model #36

Merged · 2 commits · Mar 9, 2018

Changes from 1 commit
4 changes: 4 additions & 0 deletions .gitignore
@@ -39,3 +39,7 @@ examples/.ipynb_checkpoints/

# pyenv
.python-version

# Data files
*.h5
*.dpkl
2 changes: 1 addition & 1 deletion github_issue_summarization/README.md
@@ -21,7 +21,7 @@ By the end of this tutorial, you should learn how to:
datasets
* Train a Sequence-to-Sequence model using TensorFlow on the cluster using
GPUs
-* Serve the model using a Tornado Server
* Serve the model using [Seldon Core](https://github.com/SeldonIO/seldon-core/)

## Steps:

22 changes: 22 additions & 0 deletions github_issue_summarization/notebooks/IssueSummarization.py
@@ -0,0 +1,22 @@
from __future__ import print_function

import dill as dpickle
import numpy as np
from keras.models import load_model

from seq2seq_utils import Seq2Seq_Inference


class IssueSummarization(object):

    def __init__(self):
        # Load the serialized body/title preprocessors and the trained
        # Keras model produced by the training step.
        with open('body_pp.dpkl', 'rb') as f:
            body_pp = dpickle.load(f)
        with open('title_pp.dpkl', 'rb') as f:
            title_pp = dpickle.load(f)
        self.model = Seq2Seq_Inference(encoder_preprocessor=body_pp,
                                       decoder_preprocessor=title_pp,
                                       seq2seq_model=load_model('seq2seq_model_tutorial.h5'))

    def predict(self, X, feature_names):
        # X is a 2-D array with one issue body per row; return a 2-D array
        # of titles (element [1] of generate_issue_title's result is the title).
        return np.asarray([[self.model.generate_issue_title(body[0])[1]] for body in X])
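
As a quick local smoke test of this class (an illustrative sketch, not part of the PR; it assumes the three model files and `seq2seq_utils.py` sit in the working directory, and mirrors the nested-array input shape used in the sample request further below):

```
from IssueSummarization import IssueSummarization

model = IssueSummarization()
# seldon-core passes one row per instance; each row holds one issue body.
X = [["add a new property to disable detection of image stream files"]]
print(model.predict(X, feature_names=None))  # -> [['<generated title>']]
```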
11 changes: 11 additions & 0 deletions github_issue_summarization/notebooks/requirements.txt
@@ -0,0 +1,11 @@
numpy
keras
dill
matplotlib
tensorflow
annoy
tqdm
nltk
IPython
ktext
h5py

**Member:** if you don't mind please use pipenv

**Author:** Wrapping a seldon-core microservice requires a requirements.txt file: https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md#create-a-model-folder

**Contributor:** @jimexist might want to open an issue in kubeflow/kubeflow or Seldon to follow up with them. @cliveseldon
54 changes: 0 additions & 54 deletions github_issue_summarization/notebooks/server.py

This file was deleted.

55 changes: 48 additions & 7 deletions github_issue_summarization/serving_the_model.md
@@ -1,21 +1,62 @@
# Serving the model

-We are going to use a simple tornado server to serve the model. The [server.py](notebooks/server.py) contains the server code.
We are going to use [seldon-core](https://github.com/SeldonIO/seldon-core) to serve the model. [IssueSummarization.py](notebooks/IssueSummarization.py) contains the code for this model. We will wrap this class into a seldon-core microservice, which we can then deploy as a REST or gRPC API server.

-Start the server using `python server.py --port=8888`.
> The model is written in Keras and when exported as a TensorFlow model seems to be incompatible with TensorFlow Serving. So we're using seldon-core to serve this model since seldon-core allows you to serve any arbitrary model. More details [here](https://github.com/kubeflow/examples/issues/11#issuecomment-371005885).
**Contributor:** Might be good to open up an issue to track models that TFServing doesn't work with so we can loop in various folks, e.g. from the TFServing team. It's unfortunate that we can't use TFServing.

**Author:** Done #38


-> The model is written in Keras and when exported as a TensorFlow model seems to be incompatible with TensorFlow Serving. So we're using our own webserver to serve this model. More details [here](https://github.com/kubeflow/examples/issues/11#issuecomment-371005885).
# Prerequisites

-## Sample request
Ensure that you have the following files from the [training](training_the_model.md) step in your `notebooks` directory (a quick file check is sketched after the list):
**Contributor:** Why the notebooks directory? What if we trained the model by running a TFJob and the model is stored on GCS/S3?

**Contributor:** I guess if they ran the TFJob on GCS they could just use `gsutil` to copy from the bucket to their pod.


* `seq2seq_model_tutorial.h5` - the Keras model
* `body_pp.dpkl` - the serialized body preprocessor
* `title_pp.dpkl` - the serialized title preprocessor
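
A quick check that these training outputs are in place (an illustrative sketch, assuming it is run from the `notebooks` directory):

```
import os

for name in ["seq2seq_model_tutorial.h5", "body_pp.dpkl", "title_pp.dpkl"]:
    assert os.path.isfile(name), "missing %s - rerun the training step" % name
```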

# Wrap the model into a seldon-core microservice

`cd` into the `notebooks` directory and run the following docker command. This will create a `build/` directory. (The positional arguments after the wrapper image name are the model folder, the model class name, the image version, and the docker repo; `--base-image` selects the Python base image.)

```
cd notebooks/
docker run -v $(pwd):/my_model seldonio/core-python-wrapper:0.7 /my_model IssueSummarization 0.1 seldonio --base-image=python:3.6
```

**Contributor:** Does Seldon's model wrapper support reading the model from GCS/S3?

Could we run the python wrapper as a job on K8s? Could we create a ksonnet template to make it easy to launch such a job?

I don't think we should try to address either of those issues in this PR. But it might be good to open up issues to track them.

/cc @cliveseldon @gsunner @ahousley

**Author:**

> Does Seldon's model wrapper support reading the model from GCS/S3?

Not that I am aware of. Looking at the instructions, it seems to require all the files in a single folder, which contains the code and the model, to build the serving image.

Created an issue to track the feature for serving a seldon-core model: kubeflow/kubeflow#405

**Author:** Looks like there is a seldon component in kubeflow. In the next PR, I'll try to deploy this image on k8s using this component.

**Contributor:** You can specify any persistent volume in the definition of the SeldonDeployment in the componentSpec, as it's a standard Kubernetes PodTemplateSpec. So you can load your parameters from there rather than package them as part of the model image if you wish. The kubeflow-seldon example shows an example using an NFS volume to load the trained model parameters.

The `build/` directory contains all the necessary files to build the seldon-core microservice image:

```
cd build/
./build_image.sh
```

Now you should see an image named `seldonio/issuesummarization:0.1` in your docker images. To test the model, you can run it locally using
`docker run -p 5000:5000 seldonio/issuesummarization:0.1`

**Contributor:** Can we tag it into a repo like GCR/DockerHub/Quay? We'll need that to deploy the model on K8s.

**Author:** Updated to use a GCR tag.

> You can find more details about wrapping a model with seldon-core [here](https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md)

## Sample request and response

Request

```
-curl -X POST -H 'Content-Type: application/json' -d '{"instances": ["issue overview add a new property to disable detection of image stream files those ended with -is.yml from target directory. expected behaviour by default cube should not process image stream files if user does not set it. current behaviour cube always try to execute -is.yml files which can cause some problems in most of cases, for example if you are using kuberentes instead of openshift or if you use together fabric8 maven plugin with cube"]}' http://localhost:8888/predict
curl -X POST -d 'json={"data":{"ndarray":[["issue overview add a new property to disable detection of image stream files those ended with -is.yml from target directory. expected behaviour by default cube should not process image stream files if user does not set it. current behaviour cube always try to execute -is.yml files which can cause some problems in most of cases, for example if you are using kuberentes instead of openshift or if you use together fabric8 maven plugin with cube"]]}}' http://localhost:5000/predict
```

-## Sample response
Response

```
{"predictions": ["add a new property to disable detection"]}
{
"data": {
"names": [
"t:0"
],
"ndarray": [
[
"add a new property to disable detection"
]
]
}
}
```
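
The same request issued from Python (an illustrative sketch, not part of the PR; it assumes the microservice is running locally on port 5000 and the `requests` library is installed):

```
import json

import requests

# seldon-core's REST microservice expects a form field named "json"
# carrying the payload, matching the curl call above.
payload = {"data": {"ndarray": [["issue overview add a new property to disable detection of image stream files"]]}}
resp = requests.post("http://localhost:5000/predict", data={"json": json.dumps(payload)})

# The generated title comes back under data.ndarray.
print(resp.json()["data"]["ndarray"][0][0])
```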

Next: [Teardown](teardown.md)