added time-series algorithm

SeldonIO · Aug 28, 2020 · 507328a
1 parent 4eb029e
commit 507328a
Showing 4 changed files with 375 additions and 0 deletions.
diff --git a/doc/source/examples/statsmodels.nblink b/doc/source/examples/statsmodels.nblink
@@ -0,0 +1,3 @@
+{
+  "path": "../../../examples/models/statsmodels/statsmodels.ipynb"
+}
diff --git a/examples/models/statsmodels/dashboard.png b/examples/models/statsmodels/dashboard.png
diff --git a/examples/models/statsmodels/dshb3.png b/examples/models/statsmodels/dshb3.png
diff --git a/examples/models/statsmodels/statsmodels.ipynb b/examples/models/statsmodels/statsmodels.ipynb
@@ -0,0 +1,372 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Deploying Time-Series Models on Seldon"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    "This notebook walks through the steps to deploy your first time-series model on Seldon. The first step is to install statsmodels and s2i on your local system. s2i is used to convert the source code into a Docker image, and statsmodels is a Python library for building statistical models.\n",
+    "\n",
+    "Dependencies: \n",
+    "\n",
+    "-)S2I  (https://rb.gy/jgybo9)\n",
+    "\n",
+    "-)statsmodels (pip install statsmodels)\n",
+    "\n",
+    "\n",
+    "\n",
+    "Assuming you have installed statsmodels and s2i, the next step is to create a joblib file of your time-series model. Sample code is given below. Here we use a Holt-Winters seasonal model and the shampoo sales dataset as a basic example.\n",
+    "\n",
+    "\n",
+    "The univariate dataset: https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Code snippet to create a joblib file:\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import joblib\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from statsmodels.tsa.holtwinters import ExponentialSmoothing\n",
+    "\n",
+    "df = pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv')\n",
+    "\n",
+    "# Chronological 80/20 train-test split (no shuffling for time series)\n",
+    "split = int(len(df) * 0.8)\n",
+    "train = df[:split].copy()  # copy so the column assignments below do not touch df\n",
+    "test = df[split:].copy()\n",
+    "\n",
+    "# Pre-process the Month field and use it as the index\n",
+    "train['Timestamp'] = pd.to_datetime(train['Month'], format='%m-%d')\n",
+    "train.index = train['Timestamp']\n",
+    "test['Timestamp'] = pd.to_datetime(test['Month'], format='%m-%d')\n",
+    "test.index = test['Timestamp']\n",
+    "\n",
+    "# Fit an additive Holt-Winters model; fit() estimates the smoothing parameters\n",
+    "model = ExponentialSmoothing(np.asarray(train['Sales']), seasonal_periods=7, trend='add', seasonal='add').fit()\n",
+    "\n",
+    "# Serialize the fitted model; the serving class loads this file\n",
+    "joblib.dump(model, 'model.sav')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# The next step is to write the model code in the format defined by s2i, as given below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import joblib\n",
+    "\n",
+    "\n",
+    "class holt_winter(object):\n",
+    "    \"\"\"\n",
+    "    Model template. You can load your model parameters in __init__ from a location accessible at runtime.\n",
+    "    \"\"\"\n",
+    "\n",
+    "    def __init__(self):\n",
+    "        \"\"\"\n",
+    "        Add any initialization parameters. These will be passed at runtime from the graph definition parameters defined in your SeldonDeployment Kubernetes resource manifest.\n",
+    "\n",
+    "        Here we load the fitted model from the joblib file.\n",
+    "        \"\"\"\n",
+    "        self.model = joblib.load('model.sav')\n",
+    "        print(\"Initializing, inside constructor\")\n",
+    "\n",
+    "    def predict(self, X, feature_names=None):\n",
+    "        \"\"\"\n",
+    "        Return a prediction.\n",
+    "\n",
+    "        Parameters\n",
+    "        ----------\n",
+    "        X : array-like\n",
+    "            Number of steps to forecast, as sent in the request payload.\n",
+    "        feature_names : array of feature names (optional)\n",
+    "\n",
+    "        This space can be used for data pre-processing as well.\n",
+    "        \"\"\"\n",
+    "        print(X)\n",
+    "        print(\"Predict called - forecasting X steps ahead\")\n",
+    "        return self.model.forecast(X)"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    "We now create an environment_rest file and add the following:\n",
+    "\n",
+    "MODEL_NAME=holt_winter\n",
+    "API_TYPE=REST\n",
+    "SERVICE_TYPE=MODEL\n",
+    "PERSISTENCE=0\n",
+    "\n",
+    "\n",
+    "MODEL_NAME:\n",
+    "The name of the class containing the model. Also the name of the python file which will be imported.\n",
+    "\n",
+    "API_TYPE:\n",
+    "API type to create. Can be REST or GRPC\n",
+    "\n",
+    "SERVICE_TYPE:\n",
+    "The service type being created. Available options are:\n",
+    "-)MODEL\n",
+    "-)ROUTER\n",
+    "-)TRANSFORMER\n",
+    "-)COMBINER\n",
+    "-)OUTLIER_DETECTOR\n",
+    "\n",
+    "\n",
+    "\n",
+    "PERSISTENCE:\n",
+    "Set to either 0 or 1. Default is 0. If set to 1, your model will be saved periodically to Redis and loaded from Redis if a saved copy exists, or created fresh if not.\n",
+    "\n",
+    "\n",
+    "\n",
+    "Also create a requirements.txt file listing the libraries we use:\n",
+    "\n",
+    "\n",
+    "joblib\n",
+    "statsmodels\n",
+    "pandas\n",
+    "numpy\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we build the image using the s2i command. Replace \"seldonio/statsmodel-holts:0.1\" with the image name of your choice:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!s2i build -E environment_rest . seldonio/seldon-core-s2i-python3:0.18 seldonio/statsmodel-holts:0.1\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Run the Docker image you just built:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!docker run --name \"holt_predictor\" -d --rm -p 5000:5000 seldonio/statsmodel-holts:0.1\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The model is now served on localhost at port 5000. You can test it with a curl command; here we ask the model to forecast sales for the next 3 time steps."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!curl  -s http://localhost:5000/predict -H \"Content-Type: application/json\" -d '{\"data\":{\"ndarray\":3}}'\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The next step is to push the image to a Docker registry. You can use Docker Hub or a private registry in your cluster."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!docker push seldonio/statsmodel-holts:0.1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The final step is to apply the following SeldonDeployment configuration to your cluster."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    "apiVersion: machinelearning.seldon.io/v1alpha2\n",
+    "kind: SeldonDeployment\n",
+    "metadata:\n",
+    "  name: holt-predictor\n",
+    "spec:\n",
+    "  name: holt-predictor\n",
+    "  predictors:\n",
+    "  - componentSpecs:\n",
+    "    - spec:\n",
+    "        containers:\n",
+    "        - image: seldonio/statsmodel-holts:0.1\n",
+    "          imagePullPolicy: IfNotPresent\n",
+    "          name: holt-predictor\n",
+    "    graph:\n",
+    "      children: []\n",
+    "      endpoint:\n",
+    "        type: REST\n",
+    "      name: holt-predictor\n",
+    "      type: MODEL\n",
+    "    name: holt-predictor\n",
+    "    replicas: 1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Your model will now be deployed as a service. Create a route so that external traffic can reach it. A sample curl request for the model is shown below; it uses a dummy IP address, which you should replace with the route you created."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!curl -s -d '{\"data\": {\"ndarray\":2}}'    -X POST http://160.11.22.334:4556/seldon/testseldon/holt-predictor/api/v1.0/predictions    -H \"Content-Type: application/json\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the above command, we request a forecast of shampoo sales for the next 2 time steps. testseldon is the namespace; replace it with the namespace where your model is deployed.\n",
+    "\n",
+    "\n",
+    "The response we get is:\n",
+    "\n",
+    "{\"data\":{\"names\":[],\"ndarray\":[487.86681173,415.82743026 ]},\"meta\":{}}\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The data returned is an array with 2 values, which are the values predicted by the model: in this case, the forecast shampoo sales."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<span style=\"color: red;\">Note: it is suggested that you try the model on your local system before deploying it on the cluster</span>."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Model Monitoring"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Once the model is deployed, you can monitor various metrics. The two main ones are:\n",
+    "\n",
+    "1) Requests per second <br>\n",
+    "2) Latency in serving the request\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "A model deployed on Seldon can be monitored using the built-in metrics dashboard on Grafana. Instructions for deploying the metrics dashboard: https://docs.seldon.io/projects/seldon-core/en/v1.1.0/analytics/analytics.html\n",
+    "\n",
+    "\n",
+    "Below are images of the dashboard Seldon provides out of the box:"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![dashboard_image1](dashboard.png)\n",
+    "![dashboard_image2](dshb3.png)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Summary"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook covers deploying a time-series model on Seldon. Once deployed, the model can be queried to forecast values from a given dataset. This is useful for anyone who wants to serve time-series algorithms as forecasting services.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
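
The curl requests in the notebook above send a JSON body of the form `{"data": {"ndarray": N}}`, and the sample response carries the forecasts under `data.ndarray`. As a minimal sketch of that request and response shape, using only the Python standard library, the payload can be built and the response parsed as follows. The helper names `build_payload` and `parse_forecast` are hypothetical and not part of Seldon or the notebook; the sample response is the one quoted in the notebook, and no network call is made.

```python
import json


def build_payload(steps):
    """Build the JSON body used by the curl requests: {"data": {"ndarray": steps}}."""
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    return json.dumps({"data": {"ndarray": steps}})


def parse_forecast(response_text):
    """Extract the list of forecast values from a Seldon-style JSON response."""
    body = json.loads(response_text)
    return [float(value) for value in body["data"]["ndarray"]]


# Sample response quoted in the notebook (hypothetical helpers, real sample data).
sample_response = '{"data":{"names":[],"ndarray":[487.86681173,415.82743026]},"meta":{}}'

payload = build_payload(2)
forecast = parse_forecast(sample_response)
print(payload)
print(forecast)
```

The same payload string could be passed as the `-d` argument of the curl commands above, or posted with any HTTP client of your choice.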