From afc53174ec304dbb6ecd3f3c06296558e031ab18 Mon Sep 17 00:00:00 2001
From: Narendranath Nadig <35164475+narennadig@users.noreply.github.com>
Date: Mon, 31 Aug 2020 20:28:52 +0530
Subject: [PATCH] Add files via upload
---
examples/models/statsmodels/statsmodels.ipynb | 402 ++++++++++++++++++
1 file changed, 402 insertions(+)
create mode 100644 examples/models/statsmodels/statsmodels.ipynb
diff --git a/examples/models/statsmodels/statsmodels.ipynb b/examples/models/statsmodels/statsmodels.ipynb
new file mode 100644
index 0000000000..597f6182a2
--- /dev/null
+++ b/examples/models/statsmodels/statsmodels.ipynb
@@ -0,0 +1,402 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Deploying Time-Series Models on Seldon "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The following notebook are steps to deploy your first time-series model on Seldon. The first step is to install statsmodels on our local system, along with s2i. s2i will be used to convert the source code to a docker image and stasmodels is a python library to build statistical models. \n",
+ "\n",
+ "Dependencies:\n",
+ "\n",
+ "1. Seldon-core (https://docs.seldon.io/projects/seldon-core/en/v1.1.0/workflow/install.html) \n",
+ "\n",
+ "2. s2i - Source to Image (https://rb.gy/jgybo9)\n",
+ "\n",
+ "3. statsmodels (https://www.statsmodels.org/stable/index.html) \n",
+ "\n",
+ "\n",
+ "\n",
+ "Assuming you have installed statsmodels and s2i, the next step is to create a joblib file of your time-series model. The sample code is given below . Here we have considered a Holt- Winter's seasonal model and the shampoo sales dataset as a basic example. \n",
+ " \n",
+ " \n",
+ "The univariate dataset : https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!pip install statsmodels"
+ ]
+ },
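+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an optional sanity check, the commands below verify that s2i and kubectl are available on your PATH. This assumes both have already been installed as described in the dependencies above."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Verify the s2i and kubectl binaries are installed and on the PATH\n",
+ "!s2i version\n",
+ "!kubectl version --client"
+ ]
+ },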
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code snippet to create a joblib file :\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "from statsmodels.tsa.holtwinters import ExponentialSmoothing\n",
+ "import numpy as np\n",
+ "import joblib\n",
+ "\n",
+ "df=pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv')\n",
+ "\n",
+ "#Taking a test-train split of 80 %\n",
+ "train=df[0:int(len(df)*0.8)] \n",
+ "test=df[int(len(df)*0.8):]\n",
+ "\n",
+ "#Pre-processing the Month field\n",
+ "train.Timestamp = pd.to_datetime(train.Month,format='%m-%d') \n",
+ "train.index = train.Timestamp \n",
+ "test.Timestamp = pd.to_datetime(test.Month,format='%m-%d') \n",
+ "test.index = test.Timestamp \n",
+ "\n",
+ "#fitting the model based on optimal parameters\n",
+ "model = ExponentialSmoothing(np.asarray(train['Sales']) ,seasonal_periods=7 ,trend='add', seasonal='add',).fit()\n",
+ "joblib.dump(model,'model.sav')"
+ ]
+ },
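+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Before wrapping the model, it can help to sanity-check the saved joblib file locally. The sketch below simply reloads model.sav and forecasts the next 3 periods; the exact numbers depend on the fitted model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import joblib\n",
+ "\n",
+ "# Reload the saved Holt-Winters results and forecast 3 periods ahead\n",
+ "loaded_model = joblib.load('model.sav')\n",
+ "print(loaded_model.forecast(3))"
+ ]
+ },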
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### The Next step is to write the code in a format defined by s2i as given below :"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile holt_winter.py\n",
+ "\n",
+ "import joblib\n",
+ "class holt_winter(object):\n",
+ " \"\"\"\n",
+ " Model template. You can load your model parameters in __init__ from a location accessible at runtime\n",
+ " \"\"\"\n",
+ " \n",
+ " def __init__(self):\n",
+ " \n",
+ " \"\"\"\n",
+ " Add any initialization parameters. These will be passed at runtime from the graph definition parameters defined in your seldondeployment kubernetes resource manifest.\n",
+ " \n",
+ " loading the joblib file \n",
+ " \"\"\"\n",
+ " self.model = joblib.load('model.sav')\n",
+ " print(\"Initializing ,inside constructor\")\n",
+ "\n",
+ "\n",
+ " def predict(self,X,feature_names):\n",
+ " \"\"\"\n",
+ " Return a prediction.\n",
+ " Parameters\n",
+ " ----------\n",
+ " X : array-like\n",
+ " feature_names : array of feature names (optional)\n",
+ " \n",
+ " This space can be used for data pre-processing as well\n",
+ " \"\"\"\n",
+ " print(X)\n",
+ " print(\"Predict called - will run idenity function\")\n",
+ " return self.model.forecast(X)"
+ ]
+ },
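+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optionally, you can exercise the wrapper class directly in Python before building the image. This is a minimal local check, assuming model.sav and holt_winter.py are in the current directory; Seldon itself will call predict for you once the image is running."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from holt_winter import holt_winter\n",
+ "\n",
+ "# Instantiate the wrapper (loads model.sav) and request a 3-step forecast\n",
+ "predictor = holt_winter()\n",
+ "print(predictor.predict(3, None))"
+ ]
+ },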
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "After saving the code, we now create an environment_rest file and add the following lines: \n",
+ "\n",
+ "MODEL_NAME=holt_winter
\n",
+ "API_TYPE=REST
\n",
+ "SERVICE_TYPE=MODEL
\n",
+ "PERSISTENCE =0
\n",
+ "\n",
+ "\n",
+ "MODEL_NAME:
\n",
+ "The name of the class containing the model. Also the name of the python file which will be imported.
\n",
+ "\n",
+ "API_TYPE:
\n",
+ "API type to create. Can be REST or GRPC
\n",
+ "\n",
+ "SERVICE_TYPE:
\n",
+ "The service type being created. Available options are:
\n",
+ "1. MODEL
\n",
+ "2. ROUTER
\n",
+ "3. TRANSFORMER
\n",
+ "4. COMBINER
\n",
+ "5. OUTLIER_DETECTOR
\n",
+ "\n",
+ "\n",
+ "\n",
+ "PERSISTENCE:
\n",
+ "Set either to 0 or 1. Default is 0. If set to 1 then your model will be saved periodically to redis and loaded from redis (if exists) or created fresh if not.
\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile requirements.txt\n",
+ "joblib\n",
+ "statsmodels\n",
+ "pandas\n",
+ "numpy\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile environment_rest\n",
+ "\n",
+ "MODEL_NAME=holt_winter\n",
+ "API_TYPE=REST \n",
+ "SERVICE_TYPE=MODEL\n",
+ "PERSISTENCE =0\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now we build the image using the s2i command, replace \"seldonio/statsmodel-holts:0.1\" with the image name of your choice :"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!s2i build -E environment_rest . seldonio/seldon-core-s2i-python3:0.18 seldonio/statsmodel-holts:0.1\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Running the docker image created:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!docker run --name \"holt_predictor\" -d --rm -p 5000:5000 seldonio/statsmodel-holts:0.1\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The code is now running at the local host at port 5000. It can be tested by sending a curl command, here we are sending a request to the model to predict the sales for the next 3 weeks."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!curl -s http://localhost:5000/predict -H \"Content-Type: application/json\" -d '{\"data\":{\"ndarray\":3}}'\n"
+ ]
+ },
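+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once the local test succeeds, you can stop the test container started above. It was named holt_predictor and launched with --rm, so it is removed automatically when stopped:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Stop and remove the local test container\n",
+ "!docker rm -f holt_predictor"
+ ]
+ },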
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The next step is to push the code into the docker registry, you are free to use the docker hub or the private registry in your cluster. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!docker push seldonio/statsmodel-holts:0.1"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The final step is to deploy the configuration file on your cluster as shown below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile model.yaml\n",
+ "\n",
+ "apiVersion: machinelearning.seldon.io/v1alpha2\n",
+ "kind: SeldonDeployment\n",
+ "metadata:\n",
+ " name: holt-predictor\n",
+ "spec:\n",
+ " name: holt-predictor\n",
+ " predictors:\n",
+ " - componentSpecs:\n",
+ " - spec:\n",
+ " containers:\n",
+ " - image: seldonio/statsmodel-holts:0.1\n",
+ " imagePullPolicy: IfNotPresent\n",
+ " name: holt-predictor\n",
+ " graph:\n",
+ " children: []\n",
+ " endpoint:\n",
+ " type: REST\n",
+ " name: holt-predictor\n",
+ " type: MODEL\n",
+ " name: holt-predictor\n",
+ " replicas: 1"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!kubectl apply -f model.yaml\n"
+ ]
+ },
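+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You can check that the SeldonDeployment has rolled out before sending traffic. The commands below are a sketch and assume the model was deployed to the testseldon namespace used later in this notebook; adjust the namespace to match your cluster."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Check the status of the SeldonDeployment and its pods\n",
+ "# (the testseldon namespace is an assumption - use the namespace you deployed to)\n",
+ "!kubectl get sdep holt-predictor -n testseldon -o jsonpath='{.status.state}'\n",
+ "!kubectl get pods -n testseldon"
+ ]
+ },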
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Your model will now be deployed as a service, create a route in order for external traffic to access it . A sample curl request (with a dummy I.P, replace it with the route created by you) for the model is :"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!curl -s -d '{\"data\": {\"ndarray\":2}}' -X POST http://160.11.22.334:4556/seldon/testseldon/holt-predictor/api/v1.0/predictions -H \"Content-Type: application/json\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the above command, we send a request to get a prediction of the sales of the shampoo for the next 2 days. testseldon is the namespace, you can replace it with the namespace created by you where the model is deployed .\n",
+ "\n",
+ "\n",
+ "The response we get is : \n",
+ "\n",
+ "{\"data\":{\"names\":[],\"ndarray\":[487.86681173,415.82743026 ]},\"meta\":{}}\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The data returned is an n-dimensional array with 2 values which is the predicted values by the model, in this case the sales of the shampoo."
+ ]
+ },
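+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The same request can also be sent from Python, which may be more convenient inside a notebook. This is a sketch using the requests library; the IP, port and namespace are the same placeholders as in the curl command above and must be replaced with your own route."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import requests\n",
+ "\n",
+ "# Placeholder endpoint - replace the IP/port and namespace with your own route\n",
+ "url = \"http://160.11.22.334:4556/seldon/testseldon/holt-predictor/api/v1.0/predictions\"\n",
+ "payload = {\"data\": {\"ndarray\": 2}}\n",
+ "\n",
+ "response = requests.post(url, json=payload)\n",
+ "print(response.json())"
+ ]
+ },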
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note: it is suggested that you try the model on your local system before deploying it on the cluster."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Model Monitoring"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once the model is deployed, you can now monitor various metrics, the 2 main ones being:\n",
+ "\n",
+ "1. Requests per second
\n",
+ "2. Latency in serving the request\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "The model deployed on Seldon can be monitored using build in metrics dashboard on Grafana. Here is the link to deploy metrics dashboard: https://docs.seldon.io/projects/seldon-core/en/v1.1.0/analytics/analytics.html.
\n",
+ "The screenshot of a sample dashboard is given below:
\n",
+ "![dashboard_image1](dashboard_image.png)\n"
+ ]
+ },
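+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If you have installed the Seldon analytics components from the link above, you can locate the Grafana service and port-forward it to view the dashboards locally. The namespace and service names below are assumptions; adjust them to match your install."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Find the Grafana service created by the analytics install\n",
+ "# (the seldon-system namespace here is an assumption - adjust to your setup)\n",
+ "!kubectl get svc -n seldon-system | grep -i grafana\n",
+ "\n",
+ "# Then port-forward it to access the dashboard locally, for example:\n",
+ "# !kubectl port-forward svc/<grafana-service-name> 3000:<service-port> -n seldon-system"
+ ]
+ },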
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Summary"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This documentation covers deploying time series model on Seldon, this model could be inferenced for forecasting values from a given data set. This is very useful for customers who want to deploy time series alogithm for forecasting models.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.4"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}