From afc53174ec304dbb6ecd3f3c06296558e031ab18 Mon Sep 17 00:00:00 2001
From: Narendranath Nadig <35164475+narennadig@users.noreply.github.com>
Date: Mon, 31 Aug 2020 20:28:52 +0530
Subject: [PATCH] Add files via upload

---
 examples/models/statsmodels/statsmodels.ipynb | 402 ++++++++++++++++++
 1 file changed, 402 insertions(+)
 create mode 100644 examples/models/statsmodels/statsmodels.ipynb

diff --git a/examples/models/statsmodels/statsmodels.ipynb b/examples/models/statsmodels/statsmodels.ipynb
new file mode 100644
index 0000000000..597f6182a2
--- /dev/null
+++ b/examples/models/statsmodels/statsmodels.ipynb
@@ -0,0 +1,402 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deploying Time-Series Models on Seldon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook walks through the steps to deploy your first time-series model on Seldon. The first step is to install statsmodels and s2i on your local system: s2i is used to convert the source code into a Docker image, and statsmodels is a Python library for building statistical models.\n",
    "\n",
    "Dependencies:\n",
    "\n",
    "1. Seldon-core (https://docs.seldon.io/projects/seldon-core/en/v1.1.0/workflow/install.html) \n",
    "\n",
    "2. s2i - Source to Image (https://rb.gy/jgybo9)\n",
    "\n",
    "3. statsmodels (https://www.statsmodels.org/stable/index.html) \n",
    "\n",
    "\n",
    "\n",
    "Assuming you have installed statsmodels and s2i, the next step is to create a joblib file of your time-series model. The sample code is given below. As a basic example, we use a Holt-Winters seasonal model on the shampoo sales dataset.\n",
    " \n",
    " \n",
    "The univariate dataset: https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install statsmodels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Code snippet to create a joblib file:\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from statsmodels.tsa.holtwinters import ExponentialSmoothing\n",
    "import numpy as np\n",
    "import joblib\n",
    "\n",
    "df = pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv')\n",
    "\n",
    "# Taking a train-test split of 80%\n",
    "train = df[0:int(len(df)*0.8)] \n",
    "test = df[int(len(df)*0.8):]\n",
    "\n",
    "# Pre-processing the Month field\n",
    "train.Timestamp = pd.to_datetime(train.Month, format='%m-%d') \n",
    "train.index = train.Timestamp \n",
    "test.Timestamp = pd.to_datetime(test.Month, format='%m-%d') \n",
    "test.index = test.Timestamp \n",
    "\n",
    "# Fitting the model with the chosen parameters\n",
    "model = ExponentialSmoothing(np.asarray(train['Sales']), seasonal_periods=7, trend='add', seasonal='add').fit()\n",
    "joblib.dump(model, 'model.sav')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The next step is to write the code in the format defined by s2i, as given below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile holt_winter.py\n",
    "\n",
    "import joblib\n",
    "class holt_winter(object):\n",
    "    \"\"\"\n",
    "    Model template.
You can load your model parameters in __init__ from a location accessible at runtime\n",
    "    \"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        \n",
    "        \"\"\"\n",
    "        Add any initialization parameters. These will be passed at runtime from the graph definition parameters defined in your SeldonDeployment Kubernetes resource manifest.\n",
    "        \n",
    "        Here we load the model from the joblib file.\n",
    "        \"\"\"\n",
    "        self.model = joblib.load('model.sav')\n",
    "        print(\"Initializing, inside constructor\")\n",
    "\n",
    "\n",
    "    def predict(self, X, feature_names):\n",
    "        \"\"\"\n",
    "        Return a prediction.\n",
    "        Parameters\n",
    "        ----------\n",
    "        X : array-like\n",
    "        feature_names : array of feature names (optional)\n",
    "        \n",
    "        This space can be used for data pre-processing as well.\n",
    "        \"\"\"\n",
    "        print(X)\n",
    "        print(\"Predict called - forecasting\", X, \"steps ahead\")\n",
    "        return self.model.forecast(X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After saving the code, we create an environment_rest file and add the following lines: \n",
    "\n",
    "MODEL_NAME=holt_winter
\n", + "API_TYPE=REST
\n", + "SERVICE_TYPE=MODEL
\n", + "PERSISTENCE =0
\n", + "\n", + "\n", + "MODEL_NAME:
\n", + "The name of the class containing the model. Also the name of the python file which will be imported.
\n", + "\n", + "API_TYPE:
\n", + "API type to create. Can be REST or GRPC
\n", + "\n", + "SERVICE_TYPE:
\n", + "The service type being created. Available options are:
\n", + "1. MODEL
\n", + "2. ROUTER
\n", + "3. TRANSFORMER
\n", + "4. COMBINER
\n", + "5. OUTLIER_DETECTOR
\n", + "\n", + "\n", + "\n", + "PERSISTENCE:
\n", + "Set either to 0 or 1. Default is 0. If set to 1 then your model will be saved periodically to redis and loaded from redis (if exists) or created fresh if not.
\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile requirements.txt\n", + "joblib\n", + "statsmodels\n", + "pandas\n", + "numpy\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile environment_rest\n", + "\n", + "MODEL_NAME=holt_winter\n", + "API_TYPE=REST \n", + "SERVICE_TYPE=MODEL\n", + "PERSISTENCE =0\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we build the image using the s2i command, replace \"seldonio/statsmodel-holts:0.1\" with the image name of your choice :" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!s2i build -E environment_rest . seldonio/seldon-core-s2i-python3:0.18 seldonio/statsmodel-holts:0.1\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Running the docker image created:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!docker run --name \"holt_predictor\" -d --rm -p 5000:5000 seldonio/statsmodel-holts:0.1\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The code is now running at the local host at port 5000. It can be tested by sending a curl command, here we are sending a request to the model to predict the sales for the next 3 weeks." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!curl -s http://localhost:5000/predict -H \"Content-Type: application/json\" -d '{\"data\":{\"ndarray\":3}}'\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to push the code into the docker registry, you are free to use the docker hub or the private registry in your cluster. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!docker push seldonio/statsmodel-holts:0.1" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The final step is to deploy the configuration file on your cluster as shown below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile model.yaml\n", + "\n", + "apiVersion: machinelearning.seldon.io/v1alpha2\n", + "kind: SeldonDeployment\n", + "metadata:\n", + " name: holt-predictor\n", + "spec:\n", + " name: holt-predictor\n", + " predictors:\n", + " - componentSpecs:\n", + " - spec:\n", + " containers:\n", + " - image: seldonio/statsmodel-holts:0.1\n", + " imagePullPolicy: IfNotPresent\n", + " name: holt-predictor\n", + " graph:\n", + " children: []\n", + " endpoint:\n", + " type: REST\n", + " name: holt-predictor\n", + " type: MODEL\n", + " name: holt-predictor\n", + " replicas: 1" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!kubectl apply -f model.yaml\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Your model will now be deployed as a service, create a route in order for external traffic to access it . 
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Your model will now be deployed as a service; create a route so that external traffic can access it.<br>A sample curl request (with a dummy IP; replace it with the route you created) for the model is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!curl -s -d '{\"data\": {\"ndarray\":2}}' -X POST http://160.11.22.334:4556/seldon/testseldon/holt-predictor/api/v1.0/predictions -H \"Content-Type: application/json\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the above command, we send a request for a forecast of the shampoo sales for the next 2 periods. testseldon is the namespace; replace it with the namespace where your model is deployed.\n",
    "\n",
    "\n",
    "The response we get is: \n",
    "\n",
    "{\"data\":{\"names\":[],\"ndarray\":[487.86681173,415.82743026 ]},\"meta\":{}}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data returned is an n-dimensional array with 2 values, which are the values predicted by the model, in this case the shampoo sales."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: it is suggested that you try the model on your local system before deploying it on the cluster."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Model Monitoring"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the model is deployed, you can monitor various metrics, the two main ones being:\n",
    "\n",
    "1. Requests per second
\n", + "2. Latency in serving the request\n", + "\n", + "\n", + "\n", + "\n", + "The model deployed on Seldon can be monitored using build in metrics dashboard on Grafana. Here is the link to deploy metrics dashboard: https://docs.seldon.io/projects/seldon-core/en/v1.1.0/analytics/analytics.html.
\n", + "The screenshot of a sample dashboard is given below:
\n", + "![dashboard_image1](dashboard_image.png)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Summary" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This documentation covers deploying time series model on Seldon, this model could be inferenced for forecasting values from a given data set. This is very useful for customers who want to deploy time series alogithm for forecasting models.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.4" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}