sklearnserver/xgbserver version upgrade (kubeflow#1954)
* sklearn/xgboost version upgrade

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add sklearn model tests

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add model missing exception

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update test model with new sklearn/xgboost version

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Set asyncio worker in transformer mode

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update alibi dependency

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update explainer model artifacts

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix explainer docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Load from model repository and return inference exception error

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add module coverage

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
yuzisun authored Dec 22, 2021
1 parent ad557b0 commit dd4eeb7
Showing 44 changed files with 155 additions and 111 deletions.
21 changes: 9 additions & 12 deletions .github/workflows/python-test.yml
@@ -37,23 +37,20 @@ jobs:
python -m pip install --upgrade pip
- name: Test with pytest
run: |
pip install pytest
pip install pytest-cov
pip install --upgrade pytest-tornasync
cd python
pip install -e ./kserve
pytest ./kserve
pip install -e ./kserve[test]
pytest --cov=kserve ./kserve
pip install -e ./sklearnserver
pytest ./sklearnserver
pytest --cov=sklearnserver ./sklearnserver
pip install -e ./xgbserver
pytest ./xgbserver
pytest --cov=xgbserver ./xgbserver
pip install -e ./pytorchserver
pytest ./pytorchserver
pytest --cov=pytorchserver ./pytorchserver
pip install -e ./pmmlserver
pytest ./pmmlserver
pytest --cov=pmmlserver ./pmmlserver
pip install -e ./lgbserver
pytest ./lgbserver
pytest --cov=lgbserver ./lgbserver
pip install -e ./paddleserver[test]
pytest ./paddleserver
pytest --cov=paddleserver ./paddleserver
pip install -e ./alibiexplainer
pytest ./alibiexplainer
pytest --cov=alibiexplainer ./alibiexplainer
2 changes: 1 addition & 1 deletion config/runtimes/kserve-sklearnserver.yaml
@@ -5,7 +5,7 @@ metadata:
spec:
supportedModelFormats:
- name: sklearn
version: "0"
version: "1"
autoSelect: true
containers:
- name: kserve-container
2 changes: 1 addition & 1 deletion config/runtimes/kserve-xgbserver.yaml
@@ -5,7 +5,7 @@ metadata:
spec:
supportedModelFormats:
- name: xgboost
version: "0"
version: "1"
autoSelect: true
containers:
- name: kserve-container
8 changes: 2 additions & 6 deletions docs/samples/explanation/alibi/income/README.md
@@ -1,7 +1,5 @@
# Example Anchors Tabular Explaination for Income Prediction

For users of KFServing v0.3.0 please follow [the README and notebook for v0.3.0 branch](https://github.com/kubeflow/kfserving/tree/v0.3.0/docs/samples/explanation/alibi/income).

This example uses a [US income dataset](https://archive.ics.uci.edu/ml/datasets/adult)

You can also try out the [Jupyter notebook](income_explanations.ipynb).
@@ -11,7 +9,7 @@ We can create a InferenceService with a trained sklearn predictor for this datas
The InferenceService is shown below:

```yaml
apiVersion: "serving.kserve.io/v1alpha2"
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "income"
@@ -30,15 +28,13 @@ spec:
minReplicas: 1
alibi:
type: AnchorTabular
storageUri: "gs://seldon-models/sklearn/income/explainer-py36-0.5.2"
storageUri: "gs://seldon-models/sklearn/income/explainer-py37-0.6.0"
resources:
requests:
cpu: 0.1
limits:
cpu: 1
```
For KFS 0.4 the explainer storageUri is `gs://seldon-models/sklearn/income/alibi/0.4.0`

Create this InferenceService:
```
16 changes: 10 additions & 6 deletions docs/samples/explanation/alibi/income/train.py
@@ -16,24 +16,28 @@
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from alibi.datasets import adult
from alibi.datasets import fetch_adult
import joblib
import dill
from sklearn.pipeline import Pipeline
import alibi

# load data
data, labels, feature_names, category_map = adult()
adult = fetch_adult()
data = adult.data
targets = adult.target
feature_names = adult.feature_names
category_map = adult.category_map

# define train and test set
np.random.seed(0)
data_perm = np.random.permutation(np.c_[data, labels])
data_perm = np.random.permutation(np.c_[data, targets])
data = data_perm[:, :-1]
labels = data_perm[:, -1]

idx = 30000
X_train, Y_train = data[:idx, :], labels[:idx]
X_test, Y_test = data[idx + 1:, :], labels[idx + 1:]
X_train, Y_train = data[:idx, :], targets[:idx]
X_test, Y_test = data[idx + 1:, :], targets[idx + 1:]

# feature transformation pipeline
ordinal_features = [x for x in range(len(feature_names)) if x not in list(category_map.keys())]
@@ -56,7 +60,7 @@
pipeline.fit(X_train, Y_train)

print("Creating an explainer")
explainer = alibi.explainers.AnchorTabular(predict_fn=lambda x: clf.predict(preprocessor.transform(x)),
explainer = alibi.explainers.AnchorTabular(predictor=lambda x: clf.predict(preprocessor.transform(x)),
feature_names=feature_names,
categorical_names=category_map)
explainer.fit(X_train)
2 changes: 1 addition & 1 deletion docs/samples/explanation/alibi/moviesentiment/README.md
@@ -9,7 +9,7 @@ We can create a InferenceService with a trained sklearn predictor for this datas
The InferenceService is shown below:

```
apiVersion: "serving.kserve.io/v1alpha2"
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "moviesentiment"
2 changes: 1 addition & 1 deletion docs/samples/explanation/alibi/moviesentiment/train.py
@@ -48,7 +48,7 @@
spacy_language_model = 'en_core_web_md'
spacy_model(model=spacy_language_model)
nlp = spacy.load(spacy_language_model)
anchors_text = AnchorText(nlp, lambda x: pipeline.predict(x))
anchors_text = AnchorText(nlp=nlp, predictor=lambda x: pipeline.predict(x))

# Test explanations locally
expl = anchors_text.explain("the actors are very bad")
13 changes: 2 additions & 11 deletions python/alibiexplainer/setup.py
@@ -32,21 +32,12 @@
python_requires='>=3.6',
packages=find_packages("alibiexplainer"),
install_requires=[
"tensorflow==2.3.2",
"kserve>=0.7.0",
"pandas>=0.24.2",
"nest_asyncio>=1.4.0",
"alibi==0.6.0",
"scikit-learn == 0.20.3",
"argparse>=1.4.0",
"requests>=2.22.0",
"alibi==0.6.2",
"joblib>=0.13.2",
"dill>=0.3.0",
"grpcio>=1.22.0",
"xgboost==1.0.2",
"xgboost==1.5.0",
"shap==0.39.0",
"numpy<1.19.0",
'spacy[lookups]>=2.0.0, <4.0.0'
],
tests_require=tests_require,
extras_require={'test': tests_require}
6 changes: 3 additions & 3 deletions python/alibiexplainer/tests/test_anchor_tabular.py
@@ -22,8 +22,8 @@
import json
from .utils import Predictor

ADULT_EXPLAINER_URI = "gs://seldon-models/sklearn/income/explainer-py37-0.6.0"
ADULT_MODEL_URI = "gs://seldon-models/sklearn/income/model"
ADULT_EXPLAINER_URI = "gs://kfserving-examples/models/sklearn/1.0/income/explainer-py37-0.6.2"
ADULT_MODEL_URI = "gs://kfserving-examples/models/sklearn/1.0/income/model"
EXPLAINER_FILENAME = "explainer.dill"


@@ -43,5 +43,5 @@ def test_anchor_tabular():
np.random.seed(0)
explanation = anchor_tabular.explain(X_test[0:1].tolist())
exp_json = json.loads(explanation.to_json())
assert exp_json["data"]["anchor"][0] == "Marital Status = Never-Married" or \
assert exp_json["data"]["anchor"][0] == "Relationship = Own-child" or \
exp_json["data"]["anchor"][0] == "Age <= 28.00"
4 changes: 2 additions & 2 deletions python/alibiexplainer/tests/test_anchor_text.py
@@ -20,12 +20,12 @@
import json
import numpy as np

MOVIE_MODEL_URI = "gs://seldon-models/sklearn/moviesentiment"
MOVIE_MODEL_URI = "gs://kfserving-examples/models/sklearn/1.0/moviesentiment/model"


def test_anchor_text():
os.environ.clear()
skmodel = SKLearnModel("adult", MOVIE_MODEL_URI)
skmodel = SKLearnModel("movie", MOVIE_MODEL_URI)
skmodel.load()
predictor = Predictor(skmodel)
anchor_text = AnchorText(predictor.predict_fn, None)
2 changes: 2 additions & 0 deletions python/kserve/kserve/handlers/http.py
@@ -37,6 +37,8 @@ def write_error(self, status_code: int, **kwargs: Any) -> None:
if exc_info is not None:
if hasattr(exc_info[1], "reason"):
reason = exc_info[1].reason
else:
reason = str(exc_info[1])

self.write({"error": reason})
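
The hunk above makes `write_error` fall back to the exception's string form when the exception carries no `reason` attribute. A minimal standalone sketch of that selection logic follows. `error_reason`, its `default` argument, and `ReasonedError` are hypothetical names used only for illustration; they are not part of kserve or Tornado.

```python
from typing import Optional


class ReasonedError(Exception):
    """Hypothetical exception that exposes an explicit `reason` attribute."""

    def __init__(self, reason: str):
        self.reason = reason


def error_reason(exc: Optional[BaseException], default: str = "unknown error") -> str:
    """Pick the message to report for an exception (illustrative helper)."""
    if exc is None:
        return default
    if hasattr(exc, "reason"):
        # Prefer the explicit reason when the exception provides one.
        return exc.reason
    # Otherwise fall back to the exception's string form.
    return str(exc)


print(error_reason(ReasonedError("bad input shape")))
print(error_reason(ValueError("could not parse payload")))
```

With this fallback, an ordinary exception without a `reason` attribute still yields a readable message.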

16 changes: 16 additions & 0 deletions python/kserve/kserve/kfmodel.py
@@ -44,6 +44,22 @@ class PredictorProtocol(Enum):
GRPC_V2 = "grpc-v2"


class ModelMissingError(Exception):
def __init__(self, path):
self.path = path

def __str__(self):
return self.path


class InferenceError(RuntimeError):
def __init__(self, reason):
self.reason = reason

def __str__(self):
return self.reason


# KFModel is intended to be subclassed by various components within KFServing.
class KFModel:
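
A short sketch of how the two new exception types behave, with the class bodies copied from the hunk above. Two points follow directly from the definitions: `str()` returns the stored path or reason, and only `InferenceError` derives from `RuntimeError`, so an `except RuntimeError` clause would not catch a missing model. The path and reason strings are made-up values.

```python
class ModelMissingError(Exception):
    def __init__(self, path):
        self.path = path

    def __str__(self):
        return self.path


class InferenceError(RuntimeError):
    def __init__(self, reason):
        self.reason = reason

    def __str__(self):
        return self.reason


missing = ModelMissingError("/mnt/models/model.bst")  # illustrative path
failed = InferenceError("feature count mismatch")     # illustrative reason

print(str(missing))
print(str(failed))
print(isinstance(failed, RuntimeError))
print(isinstance(missing, RuntimeError))
```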

12 changes: 11 additions & 1 deletion python/kserve/kserve/kfmodel_repository.py
@@ -15,6 +15,7 @@
from typing import Dict, Optional, Union
from kserve import KFModel
from ray.serve.api import RayServeHandle
import os

MODEL_MOUNT_DIRS = "/mnt/models"

@@ -29,6 +30,12 @@ def __init__(self, models_dir: str = MODEL_MOUNT_DIRS):
self.models = {}
self.models_dir = models_dir

def load_models(self):
for name in os.listdir(self.models_dir):
d = os.path.join(self.models_dir, name)
if os.path.isdir(d):
self.load_model(name)

def set_models_dir(self, models_dir): # used for unit tests
self.models_dir = models_dir

@@ -45,7 +52,7 @@ def is_model_ready(self, name: str):
if isinstance(model, KFModel):
return False if model is None else model.ready
else:
# For Ray Serve, the models are guaranteed to be ready after the deploy call.
# For Ray Serve, the models are guaranteed to be ready after deploying the model.
return True

def update(self, model: KFModel):
@@ -57,6 +64,9 @@ def update_handle(self, name: str, model_handle: RayServeHandle):
def load(self, name: str) -> bool:
pass

def load_model(self, name: str) -> bool:
pass

def unload(self, name: str):
if name in self.models:
del self.models[name]
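
The new `load_models` method walks the models directory and calls `load_model` once per subdirectory, skipping plain files. The sketch below reproduces that scan against a temporary directory. `RecordingRepository` is a hypothetical stand-in that only records names instead of loading real models, and the directory names are invented.

```python
import os
import tempfile


class RecordingRepository:
    """Hypothetical stand-in for a model repository; it records names only."""

    def __init__(self, models_dir: str):
        self.models_dir = models_dir
        self.loaded = []

    def load_models(self):
        # Same scan as in the diff: every subdirectory is treated as a model.
        for name in os.listdir(self.models_dir):
            d = os.path.join(self.models_dir, name)
            if os.path.isdir(d):
                self.load_model(name)

    def load_model(self, name: str) -> bool:
        self.loaded.append(name)
        return True


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "model-a"))
    os.mkdir(os.path.join(root, "model-b"))
    # A stray file at the top level is skipped by the isdir check.
    with open(os.path.join(root, "README.txt"), "w") as f:
        f.write("not a model")

    repo = RecordingRepository(root)
    repo.load_models()
    loaded = sorted(repo.loaded)  # os.listdir order is arbitrary
    print(loaded)
```

Because the base class leaves `load_model` as a no-op, each server's repository subclass has to supply the actual loading, as the LightGBM repository does in this commit.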
1 change: 1 addition & 0 deletions python/kserve/kserve/kfserver.py
@@ -46,6 +46,7 @@
help='The number of works to fork')
parser.add_argument('--max_asyncio_workers', default=None, type=int,
help='Max number of asyncio workers to spawn')

args, _ = parser.parse_known_args()

tornado.log.enable_pretty_logging()
4 changes: 2 additions & 2 deletions python/kserve/requirements.txt
@@ -10,7 +10,7 @@ minio>=4.0.9,<7.0.0
google-cloud-storage==1.41.1
adal>=1.2.2
table_logger>=0.3.5
numpy>=1.17.3
numpy~=1.19.2
azure-storage-blob==12.8.1
azure-identity>=1.6.0
cloudevents>=1.2.0
@@ -19,7 +19,7 @@ boto3==1.18.18
botocore==1.21.18
psutil>=5.0
ray[serve]==1.9.0
grpcio==1.38.1
grpcio>=1.34.0
google_api_core==1.29.0
jmespath==0.10.0
googleapis_common_protos==1.53.0
2 changes: 2 additions & 0 deletions python/kserve/setup.py
@@ -16,6 +16,8 @@

TESTS_REQUIRES = [
'pytest',
'pytest-cov',
'pytest-asyncio',
'pytest-tornasync',
'mypy'
]
10 changes: 4 additions & 6 deletions python/lgbserver/lgbserver/__main__.py
@@ -14,10 +14,9 @@

import argparse
import logging
import sys
import kserve


from kserve.kfmodel import ModelMissingError
from lgbserver.lightgbm_model_repository import LightGBMModelRepository
from lgbserver.model import LightGBMModel

@@ -39,10 +38,9 @@
model = LightGBMModel(args.model_name, args.model_dir, args.nthread)
try:
model.load()
except Exception:
ex_type, ex_value = sys.exc_info()[:2]
logging.error(f"fail to load model {args.model_name} from dir {args.model_dir}. "
f"exception type {ex_type}, exception msg: {ex_value}")
except ModelMissingError:
logging.error(f"fail to load model {args.model_name} from dir {args.model_dir},"
f"trying to load from model repository.")
model_repository = LightGBMModelRepository(args.model_dir, args.nthread)
# LightGBM doesn't support multi-process, so the number of http server workers should be 1.
kfserver = kserve.KFServer(workers=1, registered_models=model_repository) # pylint:disable=c-extension-no-member
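
As the hunk reads, startup now catches only `ModelMissingError` and logs that it will try the model repository, instead of catching every exception. A rough sketch of that control flow follows, using stand-ins throughout: `FakeModel`, the `start` helper, and its return labels are hypothetical, and the real code constructs `LightGBMModel`, `LightGBMModelRepository` and `kserve.KFServer`.

```python
import logging
import os
import tempfile


class ModelMissingError(Exception):
    """Stand-in mirroring kserve.kfmodel.ModelMissingError."""

    def __init__(self, path):
        self.path = path

    def __str__(self):
        return self.path


class FakeModel:
    """Hypothetical model whose load() only checks that a file exists."""

    def __init__(self, name: str, model_dir: str):
        self.name = name
        self.model_dir = model_dir
        self.ready = False

    def load(self) -> bool:
        model_file = os.path.join(self.model_dir, "model.bst")
        if not os.path.exists(model_file):
            raise ModelMissingError(model_file)
        self.ready = True
        return self.ready


def start(model_name: str, model_dir: str) -> str:
    """Return a label for the startup path taken (illustrative only)."""
    model = FakeModel(model_name, model_dir)
    try:
        model.load()
        return "single-model"
    except ModelMissingError as e:
        logging.error("model file %s not found, trying the model repository", e)
        return "model-repository"


with tempfile.TemporaryDirectory() as empty_dir:
    mode_without_file = start("model", empty_dir)

with tempfile.TemporaryDirectory() as model_dir:
    open(os.path.join(model_dir, "model.bst"), "w").close()
    mode_with_file = start("model", model_dir)

print(mode_without_file, mode_with_file)
```

Narrowing the `except` clause means other load failures, such as a corrupt model file, are no longer swallowed at startup.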
4 changes: 4 additions & 0 deletions python/lgbserver/lgbserver/lightgbm_model_repository.py
@@ -21,8 +21,12 @@ class LightGBMModelRepository(KFModelRepository):
def __init__(self, model_dir: str = MODEL_MOUNT_DIRS, nthread: int = 1):
super().__init__(model_dir)
self.nthread = nthread
self.load_models()

async def load(self, name: str) -> bool:
return self.load_model(name)

def load_model(self, name: str) -> bool:
model = LightGBMModel(name, os.path.join(self.models_dir, name), self.nthread)
if model.load():
self.update(model)
6 changes: 5 additions & 1 deletion python/lgbserver/lgbserver/model.py
@@ -18,6 +18,8 @@
import os
from typing import Dict
import pandas as pd
from kserve.kfmodel import ModelMissingError, InferenceError


BOOSTER_FILE = "model.bst"

@@ -36,6 +38,8 @@ def __init__(self, name: str, model_dir: str, nthread: int,
def load(self) -> bool:
model_file = os.path.join(
kserve.Storage.download(self.model_dir), BOOSTER_FILE)
if not os.path.exists(model_file):
raise ModelMissingError(model_file)
self._booster = lgb.Booster(params={"nthread": self.nthread},
model_file=model_file)
self.ready = True
@@ -51,4 +55,4 @@ def predict(self, request: Dict) -> Dict:
result = self._booster.predict(inputs)
return {"predictions": result.tolist()}
except Exception as e:
raise Exception("Failed to predict %s" % e)
raise InferenceError(str(e))
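
In `predict`, any failure is now re-raised as `InferenceError` carrying the original message, which gives the HTTP handler a `reason` attribute to report. A minimal sketch of that wrapping follows. The `booster_predict` parameter, the `"instances"` request key, and the toy `row_sums` predictor are assumptions made for illustration and are not taken from the LightGBM server code.

```python
from typing import Callable, Dict, List


class InferenceError(RuntimeError):
    """Stand-in mirroring kserve.kfmodel.InferenceError."""

    def __init__(self, reason):
        self.reason = reason

    def __str__(self):
        return self.reason


def predict(request: Dict, booster_predict: Callable[[List], List]) -> Dict:
    """Wrap any prediction failure in InferenceError (illustrative only)."""
    try:
        result = booster_predict(request["instances"])
        return {"predictions": list(result)}
    except Exception as e:
        raise InferenceError(str(e))


def row_sums(rows: List) -> List:
    # Toy predictor standing in for the booster.
    return [sum(row) for row in rows]


ok = predict({"instances": [[1, 2], [3, 4]]}, row_sums)
print(ok)

try:
    # Mixing an int and a str makes sum() raise TypeError inside the predictor.
    predict({"instances": [[1, "x"]]}, row_sums)
except InferenceError as err:
    reason = err.reason
    print(reason)
```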
5 changes: 2 additions & 3 deletions python/lgbserver/lgbserver/test_lightgbm_model_repository.py
@@ -31,9 +31,8 @@ async def test_load():

@pytest.mark.asyncio
async def test_load_fail():
repo = LightGBMModelRepository(model_dir=model_dir, nthread=1)
model_name = "model"
with pytest.raises(Exception):
await repo.load(model_name)
repo = LightGBMModelRepository(model_dir=invalid_model_dir, nthread=1)
model_name = "model"
assert repo.get_model(model_name) is None
assert not repo.is_model_ready(model_name)
2 changes: 1 addition & 1 deletion python/sklearnserver/setup.py
@@ -33,7 +33,7 @@
packages=find_packages("sklearnserver"),
install_requires=[
"kserve>=0.7.0",
"scikit-learn == 0.20.3",
"scikit-learn == 1.0.1",
"joblib >= 0.13.0"
],
tests_require=tests_require,