Code sync for upstream v0.14.0 #421
* propagate trc bool across vllm init Signed-off-by: Calvin Woo <cwoo92@bloomberg.net> Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com> * use args directly to avoid undefined var Signed-off-by: Calvin Woo <cwoo92@bloomberg.net> Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com> * Remove trailing space Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com> * move params to newline Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com> --------- Signed-off-by: Calvin Woo <cwoo92@bloomberg.net> Signed-off-by: calvin d. woo <calvin.d.woo@gmail.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
The KServe Python SDK README.md uses relative URLs that work well on GitHub but return a 404 error when visited on PyPI. This change updates the README.md to use absolute URLs that work well on both GitHub and PyPI. Signed-off-by: kevinbazira <kvnbzr@gmail.com>
check empty model final. Signed-off-by: HAO <howard31124@gmail.com> Co-authored-by: koshino17 <by900813@gmail.com>
* Fix No model ready error in multi model serving - Fixes the regression introduced by kserve#3275 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Mark transformer model ready in init method Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Initial implementation of inference client Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Use Inference client for e2e tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Upgrade pytest-asyncio to 0.23.4 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix mutable object initialization in default parameters Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix graph e2e tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix pmml test Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add explain, support dict response, use inference client for internal requests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix inference graph test and grpc headers Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Remove v1 datamodels Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Introduce protocol in client config Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Support inference graph Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> remove logging configs Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Update default timeout to 60 seconds Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add retry config for grpc client Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix infer model_name parameter Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add tests for graph endpoint Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> debug Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> fix http client param mismatch 
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> skip graph test Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> fix timeout in grpc client Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix url construction Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix explain Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * configure logger for e2e tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix grpc retry config Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Increase request timeout Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * configure logger for e2e tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix grpc retry config Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Increase request timeout Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Use fixtures for rest client Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Fix model name not properly parsed by inference graph Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Handle single string arg with excess whitespace Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Handle duplicate arguments Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
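The argument handling described above (a single string argument with excess whitespace, and duplicate arguments) can be illustrated with a small sketch. This is not the code from the commit: `normalize_args` is a hypothetical helper, and the rule that a later duplicate overrides an earlier one is an assumption made for illustration.

```python
import shlex
from typing import Dict, List


def normalize_args(raw_args: List[str]) -> Dict[str, str]:
    """Hypothetical helper: flatten container args into a flag -> value map."""
    tokens: List[str] = []
    for arg in raw_args:
        # A single string such as "  --model_name   foo " is split on
        # whitespace, so leading, trailing and repeated spaces are ignored.
        tokens.extend(shlex.split(arg))

    parsed: Dict[str, str] = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("--"):
            i += 1  # stray value without a flag; skipped in this sketch
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            i += 1
        elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
            key, value = token, tokens[i + 1]
            i += 2
        else:
            key, value = token, ""
            i += 1
        # Assumption: when a flag is repeated, the last occurrence wins.
        parsed[key] = value
    return parsed
```

For example, `normalize_args(["  --model_name   foo  --model_name bar "])` yields a single `--model_name` entry with the value `bar`.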
empty commit Signed-off-by: Spolti <fspolti@redhat.com>
Use add_generation_prompt for chat template Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Deduplicate the names for the additional domain names Signed-off-by: Vincent Hou <shou73@bloomberg.net> * Refactoring the functions Signed-off-by: Vincent Hou <shou73@bloomberg.net> --------- Signed-off-by: Vincent Hou <shou73@bloomberg.net>
virtual service case insensitive Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* Install packages needed for model load Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * make all apt get into a single line Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…serve#3789) * Add readiness probe for mlserver in CI Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Increase memory limit for pmml test to prevent OOMKilled and read timeout error Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Fix logprobs Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix a scenario where stream completion fails if echo is true and logprobs is nil Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix a scenario where completion fails if the prompt is token_ids and echo is set to true Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Respect tokenizer revision Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add workaround for adding None to token_logprobs and top_logprobs Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
agent watcher unit test is always flaky so increase timeout to make it stable Signed-off-by: jooho lee <jlee@redhat.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Add tests for vLLM Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * resolve comments Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Uncomment tests for fixed bugs Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
….3 (kserve#3812) * Upgrade serving runtime python version to 3.11 and debian to bookworm Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Upgrade poetry to 1.8.3 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Upgrade openjdk to 17 for pmml Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix 'AS' casing warning Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix pmml server Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Bump vLLM to 0.5.3.post1 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update makefile Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * approx probability comparison Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Set multiprocessing method to spawn Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…se 'spawn' for multiprocessing (kserve#3757) * Refactor model server to let uvicorn handle multiple workers - Refactored the ModelServer to let uvicorn handle multiple workers. This will remove the bottleneck of using 'fork' for multiprocessing - Make FastAPI app instance easily accessible across the project so that users can easily add middlewares and custom exception handlers for custom models. - Use uvloop eventpolicy Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add middleware example Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add e2e test Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove nest_asyncio in art explainer Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove uvloop Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix python tests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * revert art explainer Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove monkeypatch Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove redundant future exception logging Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Spolti <fspolti@redhat.com>
* Make ray serve an optional dependency Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Unify the log configuration using kserve logger (kserve#3577) * Configure logging for serving runtimes Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add pyyaml dependency Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * black format Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * fix pyproject.toml Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * cleanup logger for e2e Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Modify logger format to include func name Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Log model download time. Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Allow disabling logger configuration and deprecate logger related arg in model server Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase master Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Resolve comments Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * pyyaml=^6.0.0 to fix build failure Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove logger related parameters from model server Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * import model_server Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix lint Signed-off-by: Sivanantham Chinnaiyan 
<sivanantham.chinnaiyan@ideas2it.com> * Fix linting Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase, minor fixes and add e2e test Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Co-authored-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Update aif example chore: Update aif explainer example. - Bump KServe to 0.13.0, which brings some library updates and fixes a few security alerts in this example. - update the scikit-learn package name Signed-off-by: Spolti <fspolti@redhat.com> * move the local instructions to the README Signed-off-by: Spolti <fspolti@redhat.com> * empty commit Signed-off-by: Spolti <fspolti@redhat.com> --------- Signed-off-by: Spolti <fspolti@redhat.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
…ve#3737) These changes make it possible to configure KServe with its own Istio local gateway, partially decoupling KServe from the Knative local gateway. Typically, it is fine to reuse the already configured Knative local gateway for KServe (as long as the configs do not conflict). However, there are cases where a dedicated local gateway for KServe is beneficial. Some examples:
* To be able to use strict mTLS in Istio
* To reduce pressure on the Knative local gateway by having a dedicated gateway deployment (traffic would still hit the Knative gateway, but only once rather than twice)
* To be able to configure TLS on cluster-local hostnames (Knative support is still experimental)
To have a dedicated gateway in KServe, configurations similar to Knative's need to be done. At the very least, and if not using a dedicated gateway deployment, a v1/Service and an Istio Gateway resource need to be created for KServe. These resources then need to be configured in _localGateway_ and _localGatewayService_. KServe still relies on Knative routing for the KSVCs it creates. Thus, after handling an incoming request and resolving its target, the request needs to be forwarded to Knative. This is the reason for introducing a new `knativeLocalGatewayService` in the ConfigMap. The removed `ingressService` seems to be unused. Apparently, it became unused when the v1alpha1 API of the InferenceServices was deprecated and removed. Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
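As a rough illustration of the three gateway settings named in this commit message, the sketch below parses a JSON ingress config and checks that `localGateway`, `localGatewayService` and `knativeLocalGatewayService` are all set. It is only a sketch under stated assumptions: the KServe controller is written in Go, the real ConfigMap carries more fields, and every value shown here (gateway names, hostnames) as well as the `parse_ingress_config` helper is hypothetical.

```python
import json
from typing import Dict

# Keys taken from the commit message; everything else here is illustrative.
GATEWAY_KEYS = (
    "localGateway",
    "localGatewayService",
    "knativeLocalGatewayService",
)


def parse_ingress_config(raw: str) -> Dict[str, str]:
    """Hypothetical helper: return the gateway settings or fail loudly."""
    config = json.loads(raw)
    missing = [key for key in GATEWAY_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"missing gateway settings: {', '.join(missing)}")
    return {key: config[key] for key in GATEWAY_KEYS}


# Hypothetical values for a dedicated KServe local gateway that forwards
# resolved requests to the Knative local gateway service.
EXAMPLE_INGRESS = json.dumps(
    {
        "localGateway": "istio-system/kserve-local-gateway",
        "localGatewayService": "kserve-local-gateway.istio-system.svc.cluster.local",
        "knativeLocalGatewayService": "knative-local-gateway.istio-system.svc.cluster.local",
    }
)

gateways = parse_ingress_config(EXAMPLE_INGRESS)
```

Checking all three keys together reflects the flow described above: KServe receives the request on its own gateway, resolves the target, then hands the request over to Knative.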
* Add support for Azure DNS zone endpoints Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> * Add test cases for Azure Blob and File Share URI pattern matching Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> * flake8 Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> * black Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> --------- Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
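A minimal sketch of what URI pattern matching for both endpoint styles could look like is given below. It assumes the standard form `<account>.<service>.core.windows.net` and the DNS zone form `<account>.z<NN>.<service>.storage.azure.net`; the regular expressions and the `match_azure_uri` helper are written for illustration and are not the patterns used in the KServe storage code.

```python
import re
from typing import Optional, Tuple

# Standard endpoint, e.g. https://myaccount.blob.core.windows.net/container/path
_STANDARD = re.compile(
    r"^https://([a-z0-9]{3,24})\.(blob|file)\.core\.windows\.net/(.+)$"
)
# DNS zone endpoint, e.g. https://myaccount.z13.blob.storage.azure.net/container/path
_DNS_ZONE = re.compile(
    r"^https://([a-z0-9]{3,24})\.z\d{2}\.(blob|file)\.storage\.azure\.net/(.+)$"
)


def match_azure_uri(uri: str) -> Optional[Tuple[str, str]]:
    """Return (account, service) for an Azure Blob or File Share URI, else None."""
    for pattern in (_STANDARD, _DNS_ZONE):
        match = pattern.match(uri)
        if match:
            return match.group(1), match.group(2)
    return None
```

Keeping the two patterns separate avoids accepting mixed forms such as a zone label in front of `core.windows.net`.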
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Add logging request feature for vLLM Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add log request feature for huggingface Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Consolidate into one commit Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Fix configmap format Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Fix configmap Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Log configmap read error Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * fix naming Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update comments Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Add enabled flag to configmap and avoid cluster resource check in isvc defaulter Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * move client into the local model block Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix lint Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> --------- Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Sync helm chart with kustomize Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update manifest generation script to sync helm charts Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Make kserve-addressable-resolver role optional Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Prepare for 0.14.0-rc1 release Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update release process Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Comment out crd sync script in make Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix helm template syntax Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* add a new API for multi-node/multi-gpu Signed-off-by: jooho lee <jlee@redhat.com> * fix gitaction Signed-off-by: jooho lee <jlee@redhat.com> * fix merging conflict Signed-off-by: jooho lee <jlee@redhat.com> * fix gitaction fail Signed-off-by: jooho lee <jlee@redhat.com> * regenerate codegen/manifests Signed-off-by: jooho lee <jlee@redhat.com> * Apply suggestions from code review Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Jooho Lee <ljhiyh@gmail.com> * remove unnecessary comment Signed-off-by: jooho lee <jlee@redhat.com> * change the type of workerSpec in isvc to PodSpec Signed-off-by: jooho lee <jlee@redhat.com> * update controller-gen version Signed-off-by: jooho lee <jlee@redhat.com> * remove replicas from workerSpec Signed-off-by: jooho lee <jlee@redhat.com> * fix conflict merging Signed-off-by: jooho lee <jlee@redhat.com> * added size(replicas) for workerSpec again Signed-off-by: jooho lee <jlee@redhat.com> * add WorkerSpec to inferenceService Signed-off-by: jooho lee <jlee@redhat.com> * fix go linter Signed-off-by: jooho lee <jlee@redhat.com> --------- Signed-off-by: jooho lee <jlee@redhat.com> Signed-off-by: Jooho Lee <jlee@redhat.com> Signed-off-by: Jooho Lee <ljhiyh@gmail.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
…#3924) * fix openapigen.sh that can be executed from kserve dir Signed-off-by: jooho lee <jlee@redhat.com> * regenerate codegen/manifests Signed-off-by: jooho lee <jlee@redhat.com> * Update go.sum Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: jooho lee <jlee@redhat.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* Support python 3.12 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update dependencies Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update deps to support 3.12 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove python 3.8 support Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Remove skip for infer client test Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix port forward Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix sklearn pandas dep Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * skip pydantic v1 test for py 3.12 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add setuptools dep for pmml Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Fix lgb Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Include setuptools for paddle Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Include setuptools for huggingface Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Rebase Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Bump version to 0.13.0-rc0 (kserve#3665) Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Signed-off-by: jordanyono <jordanyono@gmail.com> * fixing docs Signed-off-by: jordanyono <jordanyono@gmail.com> * fix spelling mistake Signed-off-by: jordanyono <jordanyono@gmail.com> --------- Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net> Signed-off-by: jordanyono <jordanyono@gmail.com> Co-authored-by: Curtis Maddalozzo <cmaddalozzo@users.noreply.github.com>
* Fix local testing Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix codegen Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Add a flag for automount serviceaccount Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Set default to false Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Default to true Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Fix test error Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Update openapi generated.go Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Fix python lint Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Fix config loading Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> --------- Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
…ainer (kserve#3985) * Do not set security context on the storage initializer from user container Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * Add securityContext to the default storage container in the helm chart Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> --------- Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
This adds the model container as an init-container to mitigate a race condition that happens if the model container is not present on the cluster node. The race condition occurs if the cluster is able to fetch and start the runtime container before the modelcar is fetched, which causes the runtime to terminate with an error. By configuring the model container as an init-container, the runtime won't start until the modelcar is fetched. Although there is still the risk of a race condition when the cluster schedules the runtime container first, the pod should stabilize after a few restarts of the runtime container, and this should either prevent a CrashLoopBackOff event on the pod or make the crash event finish quickly. This improves compatibility with runtimes, which can now stay agnostic to the modelcar implementation until better techniques (like native sidecars and OCI volume mounts) become mature. Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
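The idea in this commit message can be sketched on a plain dictionary version of a pod spec: the modelcar is added once as an init-container, so its image is pulled before the runtime starts, and once as a regular container. The helper name `add_modelcar_init_container`, the container names, the images and the short-lived `sh -c` command are all assumptions for illustration; the actual change is implemented in the Go controller.

```python
import copy
from typing import Any, Dict


def add_modelcar_init_container(
    pod_spec: Dict[str, Any], modelcar: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: return a new pod spec with the modelcar pre-fetched."""
    spec = copy.deepcopy(pod_spec)

    # Init copy: same image, but it only has to run to completion so that
    # the node pulls the image before any regular container starts.
    init = copy.deepcopy(modelcar)
    init["name"] = f"{modelcar['name']}-init"
    # Assumption: the modelcar image ships a shell that can run this command.
    init["command"] = ["sh", "-c", "echo 'modelcar image is present'"]
    spec.setdefault("initContainers", []).append(init)

    # Regular copy: keeps serving the model files next to the runtime.
    spec.setdefault("containers", []).append(copy.deepcopy(modelcar))
    return spec


base_spec = {"containers": [{"name": "kserve-container", "image": "example/runtime:latest"}]}
modelcar = {"name": "modelcar", "image": "registry.example/models/demo:v1"}
patched = add_modelcar_init_container(base_spec, modelcar)
```

Because init-containers must finish before regular containers start, the runtime in `patched` cannot start until the modelcar image has been fetched on that node.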
* Initial commit for headers passing issue Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * modifying the e2e test for rebase conflict Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * bug fix on unittest Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * review changes Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * fix for test failure Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * bug fix on e2e test Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * overriding the entrypoint of custom model images Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * custom response header Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * fix for unittest failure Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * added custom response headers in post process Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * added predict time latency in example response header Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * fix OOM --------- Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* security update Signed-off-by: udai <udaij12@gmail.com> * adding sign off Signed-off-by: udai <udaij12@gmail.com> --------- Signed-off-by: udai <udaij12@gmail.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
* temp commit Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> * python-release.sh Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com> --------- Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
…14-upgrade Code sync with upstream, up to v0.14. Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
18c0f1b to b7a868f
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: israel-hdez, spolti
@israel-hdez: The following tests failed, say
No description provided.