Code sync for upstream v0.14.0 #421

Merged
israel-hdez merged 95 commits into opendatahub-io:master from j9436-kserve014-upgrade on Oct 28, 2024

israel-hdez
No description provided.

calwoo and others added 30 commits June 21, 2024 06:18
* propagate trc bool across vllm init

Signed-off-by: Calvin Woo <cwoo92@bloomberg.net>
Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com>

* use args directly to avoid undefined var

Signed-off-by: Calvin Woo <cwoo92@bloomberg.net>
Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com>

* Remove trailing space

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com>

* move params to newline

Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com>

---------

Signed-off-by: Calvin Woo <cwoo92@bloomberg.net>
Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
The KServe Python SDK README.md uses relative URLs that work well on GitHub but return a 404 error when visited on PyPI.

This change updates the README.md to use absolute URLs that work well on both GitHub and PyPI.

Signed-off-by: kevinbazira <kvnbzr@gmail.com>
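The relative-to-absolute rewrite described in this commit can be sketched as follows. This is only an illustration, not the change made in the commit: the `BASE_URL` value and the `absolutize_links` helper are assumptions introduced here, and the regular expression handles only simple inline Markdown links.

```python
import re
from urllib.parse import urljoin

# Hypothetical base URL, chosen only for illustration; the commit does not state it.
BASE_URL = "https://github.com/kserve/kserve/blob/master/python/kserve/"

# Matches the "](target)" part of an inline Markdown link.
LINK_RE = re.compile(r"\]\(([^)\s]+)\)")

# Targets that already carry a scheme (https:, mailto:) or are page anchors.
ABSOLUTE_RE = re.compile(r"^(?:[a-z][a-z0-9+.-]*:|#)", re.IGNORECASE)


def absolutize_links(markdown: str, base_url: str = BASE_URL) -> str:
    """Rewrite relative Markdown link targets against base_url."""

    def repl(match: re.Match) -> str:
        target = match.group(1)
        if ABSOLUTE_RE.match(target):
            return match.group(0)  # leave absolute URLs and anchors untouched
        return "](" + urljoin(base_url, target) + ")"

    return LINK_RE.sub(repl, markdown)
```

Because the rewritten targets no longer depend on where the README is rendered, the same file resolves on both GitHub and PyPI.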
check empty model final.

Signed-off-by: HAO <howard31124@gmail.com>
Co-authored-by: koshino17 <by900813@gmail.com>
* Fix No model ready error in multi model serving

- Fixes the regression introduced by kserve#3275

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Mark transformer model ready in init method

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Initial implementation of inference client

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add tests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Use Inference client for e2e tests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Upgrade pytest-asyncio to 0.23.4

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix mutable object initialization in default parameters

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix graph e2e tests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix pmml test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add explain, support dict response, use inference client for internal requests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix inference graph test and grpc headers

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Remove v1 datamodels

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Introduce protocol in client config

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Support inference graph

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

remove logging configs

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Update default timeout to 60 seconds

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add retry config for grpc client

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix infer model_name parameter

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add tests for graph endpoint

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

debug

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

fix http client param mismatch

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

skip graph test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

fix timeout in grpc client

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix url construction

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix explain

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* configure logger for e2e tests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix grpc retry config

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Increase request timeout

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Use fixtures for rest client

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
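The "Fix mutable object initialization in default parameters" item above refers to a general Python pitfall. A minimal sketch of the problem and the usual fix follows; the function names are hypothetical and are not taken from the inference client code.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at function definition time,
    # so every call that omits `bucket` shares the same list.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Using None as a sentinel creates a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

With `append_bad`, state leaks between unrelated calls; `append_good` keeps each call independent.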
* Fix model name not properly parsed by inference graph

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Handle single string arg with excess whitespace

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Handle duplicate arguments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
empty commit

Signed-off-by: Spolti <fspolti@redhat.com>
Use add_generation_prompt for chat template

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Deduplicate the names for the additional domain names

Signed-off-by: Vincent Hou <shou73@bloomberg.net>

* Refactoring the functions

Signed-off-by: Vincent Hou <shou73@bloomberg.net>

---------

Signed-off-by: Vincent Hou <shou73@bloomberg.net>
virtual service case insensitive

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* Install packages needed for model load

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* make all apt get into a single line

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…serve#3789)

* Add readiness probe for mlserver in CI

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Increase memory limit for pmml test to prevent OOMKilled and read timeout error

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Fix logprobs

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix a scenario where stream completion fails if echo is true and logprobs is nil

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix a scenario where completion fails if the prompt is token_ids and echo is set to true

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Respect tokenizer revision

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add workaround for adding None to token_logprobs and top_logprobs

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
agent watcher unit test is always flaky so increase timeout to make it stable

Signed-off-by: jooho lee <jlee@redhat.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Add tests for vLLM

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Uncomment tests for fixed bugs

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
….3 (kserve#3812)

* Upgrade serving runtime python version to 3.11 and debian to bookworm

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Upgrade poetry to 1.8.3

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Upgrade openjdk to 17 for pmml

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix 'AS' casing warning

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix pmml server

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Bump vLLM to 0.5.3.post1

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update makefile

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* approx probability comparison

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Set multiprocessing method to spawn

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…se 'spawn' for multiprocessing (kserve#3757)

* Refactor model server to let uvicorn handle multiple workers

- Refactored the ModelServer to let uvicorn handle multiple workers. This will remove the bottleneck of using 'fork' for multiprocessing

- Make FastAPI app instance easily accessible across the project so that users can easily add middlewares and custom exception handlers for custom models.

- Use uvloop eventpolicy

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add middleware example

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add e2e test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove nest_asyncio in art explainer

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove uvloop

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix python tests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* revert art explainer

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove monkeypatch

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove redundant future exception logging

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
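For readers unfamiliar with the 'fork' versus 'spawn' distinction mentioned in this commit, here is a minimal standard-library sketch. It is not the ModelServer or uvicorn code; `spawn_context`, `square` and `square_all` are hypothetical names used only for illustration.

```python
import multiprocessing as mp


def spawn_context():
    # A context bound to the 'spawn' start method: each worker starts a
    # fresh interpreter instead of inheriting the parent's memory via fork.
    return mp.get_context("spawn")


def square(x):
    # Must be a module-level function so spawned workers can import it.
    return x * x


def square_all(values, workers=2):
    # Run `square` over the values in a pool of spawned worker processes.
    ctx = spawn_context()
    with ctx.Pool(processes=workers) as pool:
        return pool.map(square, values)
```

With 'spawn', code that starts workers should be called from under an `if __name__ == "__main__":` guard, because each worker re-imports the main module.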
Signed-off-by: Spolti <fspolti@redhat.com>
* Make ray serve an optional dependency

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Unify the log configuration using kserve logger (kserve#3577)

* Configure logging for serving runtimes

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add pyyaml dependency

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* black format

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* fix pyproject.toml

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* cleanup logger for e2e

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Modify logger format to include func name

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Log model download time.

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Allow disabling logger configuration and deprecate logger related arg in model server

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase master

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* pyyaml=^6.0.0 to fix build failure

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove logger related parameters from model server

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* import model_server

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix lint

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix linting

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase, minor fixes and add e2e test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Co-authored-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Update aif example

chore:	Update aif explainer example.
	- Bump KServe to 0.13.0, which brings some library updates and fixes a few security alerts in this example.
	- update the scikit-learn package name

Signed-off-by: Spolti <fspolti@redhat.com>

* move the local instructions to the README

Signed-off-by: Spolti <fspolti@redhat.com>

* empty commit

Signed-off-by: Spolti <fspolti@redhat.com>

---------

Signed-off-by: Spolti <fspolti@redhat.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…ve#3737)

These changes introduce the possibility to configure KServe with its own Istio local gateway, to partially decouple KServe from the Knative local gateway.

Typically, it is OK to re-use the already configured Knative local gateway for KServe uses (as long as configs do not conflict). However, there are cases where having a dedicated local gateway for KServe is beneficial. Just to give some examples:
* To have the ability to use strict mTLS in Istio
* To reduce some pressure on the Knative local gateway by having a dedicated gateway deployment (it still would hit Knative gateway, but only once, rather than twice)
* To be able to configure TLS on cluster-local hostnames (Knative support is still experimental)

To have a dedicated Gateway in KServe, configuration similar to Knative's needs to be done. At the very least, even without a dedicated gateway deployment, a v1/Service and an Istio Gateway resource need to be created for KServe. These resources then need to be configured in _localGateway_ and _localGatewayService_. KServe still relies on Knative routing for the KSVCs it creates. Thus, after KServe handles an incoming request and resolves its target, the request needs to be forwarded to Knative. This is the reason for introducing a new `knativeLocalGatewayService` in the ConfigMap.

The removed `ingressService` seems to be unused. Apparently, it became unused when the v1alpha1 API of the InferenceServices was deprecated and removed.

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
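As a rough sketch of how the three settings named above relate, the snippet below assembles them into one JSON fragment. Only the key names come from this commit message; the `build_ingress_config` helper, the assumption that the ConfigMap entry is JSON, and every value shown in the usage note are hypothetical placeholders.

```python
import json

# Key names as given in the commit message.
REQUIRED_KEYS = ("localGateway", "localGatewayService", "knativeLocalGatewayService")


def build_ingress_config(local_gateway, local_gateway_service, knative_local_gateway_service):
    """Return a JSON fragment holding the three gateway settings."""
    config = {
        "localGateway": local_gateway,                  # Istio Gateway resource for KServe
        "localGatewayService": local_gateway_service,   # v1/Service in front of that gateway
        "knativeLocalGatewayService": knative_local_gateway_service,  # where requests are forwarded
    }
    missing = [key for key in REQUIRED_KEYS if not config[key]]
    if missing:
        raise ValueError(f"missing gateway settings: {missing}")
    return json.dumps(config, indent=2)
```

For example, `build_ingress_config("kserve/kserve-local-gateway", "kserve-local-gateway.istio-system.svc.cluster.local", "knative-local-gateway.istio-system.svc.cluster.local")` would describe a dedicated KServe gateway that still forwards to Knative; these names are illustrative only.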
* Add support for Azure DNS zone endpoints

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Add test cases for Azure Blob and File Share URI pattern matching

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* flake8

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* black

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

---------

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
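A minimal sketch of the kind of URI pattern matching these commits mention. It assumes that Azure DNS zone endpoints take the form `<account>.z<NN>.<service>.storage.azure.net` and that standard endpoints take the form `<account>.<service>.core.windows.net`. The regular expressions and the `parse_azure_uri` helper are illustrative and are not the patterns added by the commit.

```python
import re

# Assumed shape of a DNS zone endpoint: <account>.z<NN>.<blob|file>.storage.azure.net
AZURE_DNS_ZONE_RE = re.compile(
    r"^https://(?P<account>[a-z0-9]+)\.z[0-9]{2}\."
    r"(?P<service>blob|file)\.storage\.azure\.net/(?P<path>.+)$"
)

# Assumed shape of a standard endpoint: <account>.<blob|file>.core.windows.net
AZURE_STANDARD_RE = re.compile(
    r"^https://(?P<account>[a-z0-9]+)\."
    r"(?P<service>blob|file)\.core\.windows\.net/(?P<path>.+)$"
)


def parse_azure_uri(uri):
    """Return (account, service, path) for a recognized Azure URI, else None."""
    for pattern in (AZURE_DNS_ZONE_RE, AZURE_STANDARD_RE):
        match = pattern.match(uri)
        if match:
            return match.group("account"), match.group("service"), match.group("path")
    return None
```

Trying the DNS zone pattern alongside the standard one lets a single helper handle both Blob and File Share URIs.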
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Add logging request feature for vLLM

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add log request feature for huggingface

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
yuzisun and others added 17 commits September 30, 2024 08:20
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Consolidate into one commit

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix configmap format

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix configmap

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Log configmap read error

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* fix naming

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update comments

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add enabled flag to configmap and avoid cluster resource check in isvc defaulter

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* move client into the local model block

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix lint

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

---------

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Sync helm chart with kustomize

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update manifest generation script to sync helm charts

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Make kserve-addressable-resolver role optional

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Prepare for 0.14.0-rc1 release

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update release process

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Comment out crd sync script in make

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix helm template syntax

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* add a new API for multi-node/multi-gpu

Signed-off-by: jooho lee <jlee@redhat.com>

* fix gitaction

Signed-off-by: jooho lee <jlee@redhat.com>

* fix merging conflict

Signed-off-by: jooho lee <jlee@redhat.com>

* fix gitaction fail

Signed-off-by: jooho lee <jlee@redhat.com>

* regenerate codegen/manifests

Signed-off-by: jooho lee <jlee@redhat.com>

* Apply suggestions from code review

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Jooho Lee <ljhiyh@gmail.com>

* remove unnecessary comment

Signed-off-by: jooho lee <jlee@redhat.com>

* change the type of workerSpec in isvc to PodSpec

Signed-off-by: jooho lee <jlee@redhat.com>

* update controller-gen version

Signed-off-by: jooho lee <jlee@redhat.com>

* remove replicas from workerSpec

Signed-off-by: jooho lee <jlee@redhat.com>

* fix conflict merging

Signed-off-by: jooho lee <jlee@redhat.com>

* added size(replicas) for workerSpec again

Signed-off-by: jooho lee <jlee@redhat.com>

* add WorkerSpec to inferenceService

Signed-off-by: jooho lee <jlee@redhat.com>

* fix go linter

Signed-off-by: jooho lee <jlee@redhat.com>

---------

Signed-off-by: jooho lee <jlee@redhat.com>
Signed-off-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Jooho Lee <ljhiyh@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
…#3924)

* fix openapigen.sh that can be executed from kserve dir

Signed-off-by: jooho lee <jlee@redhat.com>

* regenerate codegen/manifests

Signed-off-by: jooho lee <jlee@redhat.com>

* Update go.sum

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho lee <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Support python 3.12

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update dependencies

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update deps to support 3.12

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove python 3.8 support

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Remove skip for infer client test

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix port forward

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix sklearn pandas dep

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* skip pydantic v1 test for py 3.12

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add setuptools dep for pmml

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Fix lgb

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Include setuptools for paddle

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Include setuptools for huggingface

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Rebase

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Bump version to 0.13.0-rc0 (kserve#3665)

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: jordanyono <jordanyono@gmail.com>

* fixing docs

Signed-off-by: jordanyono <jordanyono@gmail.com>

* fix spelling mistake

Signed-off-by: jordanyono <jordanyono@gmail.com>

---------

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: jordanyono <jordanyono@gmail.com>
Co-authored-by: Curtis Maddalozzo <cmaddalozzo@users.noreply.github.com>
* Fix local testing

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix codegen

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Add a flag for automount serviceaccount

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Set default to false

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Default to true

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix test error

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Update openapi generated.go

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix python lint

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix config loading

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

---------

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
…ainer (kserve#3985)

* Do not set security context on the storage initializer from user container

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add securityContext to the default storage container in the helm chart

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

---------

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
This adds the model container as an init-container to mitigate a race
condition that would happen if the model container is not present on the
cluster-node. The race condition happens if the cluster is able to fetch
and start the runtime container before the modelcar is fetched. This
would cause the runtime to terminate with an error.

By configuring the model container as an init-container the runtime
won't start until the modelcar is fetched. Although there is still the
risk of a race condition when the cluster schedules the runtime
container first, the pod should stabilize after a few restarts of the
runtime container and should either prevent a CrashLoopBackOff event on
the pod, or the crash event would finish quickly.

This improves compatibility with the runtimes which can now stay
agnostic to the modelcar implementation, until better techniques (like
native sidecars, and oci volume mounts) become mature.

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
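The idea can be sketched on a plain pod-spec dictionary: the modelcar image is listed a second time as an init container, so the node has to pull it before the runtime container starts. The `add_modelcar_prefetch` helper, the container name, and the command that makes the init container exit immediately are assumptions for illustration, not the controller's actual implementation.

```python
import copy


def add_modelcar_prefetch(pod_spec, modelcar_image, name="modelcar-init"):
    """Return a copy of pod_spec with the modelcar image added as an init container."""
    spec = copy.deepcopy(pod_spec)  # do not mutate the caller's spec
    init_container = {
        "name": name,                       # hypothetical container name
        "image": modelcar_image,            # same image as the modelcar sidecar
        "command": ["sh", "-c", "exit 0"],  # assumed: exit right after the image is pulled
    }
    spec.setdefault("initContainers", []).append(init_container)
    return spec
```

Because init containers must complete before regular containers start, the image pull for the modelcar happens up front, which is the ordering the commit message relies on.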
* Initial commit for headers passing issue

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* modifying the e2e test for rebase conflict

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* bug fix on unittest

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* review changes

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* fix for test failure

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* bug fix on e2e test

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* overriding the entrypoint of custom model images

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* custom response header

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* fix for unittest failure

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* added custom response headers in post process

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* added predict time latency in example response header

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* fix OOM

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* security update

Signed-off-by: udai <udaij12@gmail.com>

* adding sign off

Signed-off-by: udai <udaij12@gmail.com>

---------

Signed-off-by: udai <udaij12@gmail.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
* temp commit

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* python-release.sh

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

---------

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@openshift-ci openshift-ci bot requested review from Jooho and spolti October 22, 2024 21:00
…14-upgrade

Code sync with upstream, up to v0.14.

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
@israel-hdez israel-hdez force-pushed the j9436-kserve014-upgrade branch from 18c0f1b to b7a868f Compare October 25, 2024 17:40
openshift-ci bot commented Oct 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, spolti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot commented Oct 28, 2024

@israel-hdez: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-raw c9453bf link true /test e2e-raw
ci/prow/e2e-slow c9453bf link true /test e2e-slow
ci/prow/e2e-fast c9453bf link true /test e2e-fast

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@israel-hdez israel-hdez merged commit 733c1c3 into opendatahub-io:master Oct 28, 2024
24 of 29 checks passed
@israel-hdez israel-hdez deleted the j9436-kserve014-upgrade branch October 28, 2024 19:54
Status: Done