From 16c0b683f15b72c775b21da70f40129877d023ba Mon Sep 17 00:00:00 2001
From: Berto D'Attoma <88311595+bdattoma@users.noreply.github.com>
Date: Thu, 17 Oct 2024 14:32:38 +0200
Subject: [PATCH] Add GPU specific Tags (#1938)

* add nvidia tag for all the tests with Resources-GPU tag

* add amd gpu tag in vllm tests

* remove wrong comment
---
 .../0101__post_install.robot                  |  2 +-
 ...0__ods_dashboard_projects_additional.robot |  4 +-
 .../autoscaling-gpus.robot                    |  2 +-
 .../minimal-cuda-test.robot                   | 14 +++---
 .../minimal-pytorch-test.robot                | 10 ++---
 .../minimal-tensorflow-test.robot             | 10 ++---
 .../0501__ide_jupyterhub/multiple-gpus.robot  |  4 +-
 .../test-run-tuning-stack-tests.robot         |  2 +-
 ...test-run-distributed-workloads-tests.robot |  6 +--
 .../1002__model_serving_modelmesh_gpu.robot   |  4 +-
 .../1005__model_serving_ovms_on_kserve.robot  |  2 +-
 .../1007__model_serving_llm.robot             |  2 +-
 .../1007__model_serving_llm_UI.robot          |  2 +-
 .../1007__model_serving_llm_models.robot      | 44 +++++++++----------
 ..._model_serving_llm_other_runtimes_UI.robot |  6 +--
 .../1007__model_serving_llm_tgis.robot        |  2 +-
 .../1008__model_serving_vllm_metrics.robot    |  6 +--
 17 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/ods_ci/tests/Tests/0100__platform/0101__deploy/0101__installation/0101__post_install.robot b/ods_ci/tests/Tests/0100__platform/0101__deploy/0101__installation/0101__post_install.robot
index b61574d34..b9db7373a 100644
--- a/ods_ci/tests/Tests/0100__platform/0101__deploy/0101__installation/0101__post_install.robot
+++ b/ods_ci/tests/Tests/0100__platform/0101__deploy/0101__installation/0101__post_install.robot
@@ -53,7 +53,7 @@ Verify Notebook Controller Deployment
 Verify GPU Operator Deployment    # robocop: disable
     [Documentation]    Verifies Nvidia GPU Operator is correctly installed
     [Tags]    Sanity    Tier1
-    ...    Resources-GPU    # Not actually needed, but we first need to enable operator install by default
+    ...    Resources-GPU    NVIDIA-GPUs    # Not actually needed, but we first need to enable operator install by default
     ...    ODS-1157
     # Before GPU Node is added to the cluster
diff --git a/ods_ci/tests/Tests/0400__ods_dashboard/0410__ods_dashboard_projects/0410__ods_dashboard_projects_additional.robot b/ods_ci/tests/Tests/0400__ods_dashboard/0410__ods_dashboard_projects/0410__ods_dashboard_projects_additional.robot
index 09c6a4081..f5417a33a 100644
--- a/ods_ci/tests/Tests/0400__ods_dashboard/0410__ods_dashboard_projects/0410__ods_dashboard_projects_additional.robot
+++ b/ods_ci/tests/Tests/0400__ods_dashboard/0410__ods_dashboard_projects/0410__ods_dashboard_projects_additional.robot
@@ -84,7 +84,7 @@ Verify Notebook Tolerations Are Applied To Workbenches
 Verify User Can Add GPUs To Workbench
     [Documentation]    Verifies user can add GPUs to an already started workbench
     [Tags]    Tier1    Sanity
-    ...    ODS-2013    Resources-GPU
+    ...    ODS-2013    Resources-GPU    NVIDIA-GPUs
     Launch Data Science Project Main Page
     Create Workbench    workbench_title=${WORKBENCH_TITLE_GPU}    workbench_description=${EMPTY}
     ...    prj_title=${PRJ_TITLE}    image_name=${NB_IMAGE_GPU}    deployment_size=Small
@@ -108,7 +108,7 @@ Verify User Can Add GPUs To Workbench
 Verify User Can Remove GPUs From Workbench
     [Documentation]    Verifies user can remove GPUs from an already started workbench
     [Tags]    Tier1    Sanity
-    ...    ODS-2014    Resources-GPU
+    ...    ODS-2014    Resources-GPU    NVIDIA-GPUs
     Launch Data Science Project Main Page
     Create Workbench    workbench_title=${WORKBENCH_TITLE_GPU}    workbench_description=${EMPTY}
     ...    prj_title=${PRJ_TITLE}    image_name=${NB_IMAGE_GPU}    deployment_size=Small
diff --git a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/autoscaling-gpus.robot b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/autoscaling-gpus.robot
index 8d069ef36..0d1009354 100644
--- a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/autoscaling-gpus.robot
+++ b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/autoscaling-gpus.robot
@@ -11,7 +11,7 @@ Resource    ../../../Resources/Page/OCPDashboard/Pods/Pods.robot
 Library    JupyterLibrary
 Suite Setup    Spawner Suite Setup
 Suite Teardown    End Web Test
-Test Tags    Resources-GPU
+Test Tags    Resources-GPU    NVIDIA-GPUs
 
 
 *** Variables ***
diff --git a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-cuda-test.robot b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-cuda-test.robot
index f7ce32608..50a6f1475 100644
--- a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-cuda-test.robot
+++ b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-cuda-test.robot
@@ -22,35 +22,35 @@ Verify CUDA Image Can Be Spawned With GPU
     [Documentation]    Spawns CUDA image with 1 GPU and verifies that the GPU is
     ...    not available for other users.
     [Tags]    Sanity
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1141    ODS-346    ODS-1359
     Pass Execution    Passing tests, as suite setup ensures that image can be spawned
 
 Verify CUDA Image Includes Expected CUDA Version
     [Documentation]    Checks CUDA version
     [Tags]    Sanity
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1142
     Verify Installed CUDA Version    ${EXPECTED_CUDA_VERSION}
 
 Verify PyTorch Library Can See GPUs In Minimal CUDA
     [Documentation]    Installs PyTorch and verifies it can see the GPU
     [Tags]    Sanity
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1144
     Verify Pytorch Can See GPU    install=True
 
 Verify Tensorflow Library Can See GPUs In Minimal CUDA
     [Documentation]    Installs Tensorflow and verifies it can see the GPU
     [Tags]    Sanity
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1143
     Verify Tensorflow Can See GPU    install=True
 
 Verify Cuda Image Has NVCC Installed
     [Documentation]    Verifies NVCC Version in Minimal CUDA Image
     [Tags]    Sanity
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-483
     ${nvcc_version} =    Run Cell And Get Output    input=!nvcc --version
     Should Not Contain    ${nvcc_version}    /usr/bin/sh: nvcc: command not found
@@ -58,7 +58,7 @@ Verify Cuda Image Has NVCC Installed
 Verify Previous CUDA Notebook Image With GPU
     [Documentation]    Runs a workload after spawning the N-1 CUDA Notebook
     [Tags]    Tier2    LiveTesting
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2128
     [Setup]    N-1 CUDA Setup
     Spawn Notebook With Arguments    image=${NOTEBOOK_IMAGE}    size=Small    gpus=1    version=previous
@@ -90,7 +90,7 @@ Verify CUDA Image Suite Setup
     # This will fail in case there are two nodes with the same number of GPUs
     # Since the overall available number won't change even after 1 GPU is assigned
     # However I can't think of a better way to execute this check, under the assumption that
-    # the Resources-GPU tag will always ensure there is 1 node with 1 GPU on the cluster.
+    # the Resources-GPU will always ensure there is 1 node with 1 GPU on the cluster.
     ${maxNo} =    Find Max Number Of GPUs In One Node
     ${maxSpawner} =    Fetch Max Number Of GPUs In Spawner Page
     # Need to continue execution even on failure or the whole suite will be failed
diff --git a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-pytorch-test.robot b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-pytorch-test.robot
index cd9816fd4..b72455e7f 100644
--- a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-pytorch-test.robot
+++ b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-pytorch-test.robot
@@ -49,7 +49,7 @@ Verify Tensorboard Is Accessible
 Verify PyTorch Image Can Be Spawned With GPU
     [Documentation]    Spawns PyTorch image with 1 GPU
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1145
     Clean Up Server
     Stop JupyterLab Notebook Server
@@ -60,28 +60,28 @@ Verify PyTorch Image Can Be Spawned With GPU
 Verify PyTorch Image Includes Expected CUDA Version
     [Documentation]    Checks CUDA version
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1146
     Verify Installed CUDA Version    ${EXPECTED_CUDA_VERSION}
 
 Verify PyTorch Library Can See GPUs In PyTorch Image
     [Documentation]    Verifies PyTorch can see the GPU
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1147
     Verify Pytorch Can See GPU
 
 Verify PyTorch Image GPU Workload
     [Documentation]    Runs a workload on GPUs in PyTorch image
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1148
     Run Repo And Clean    https://github.com/lugi0/notebook-benchmarks    notebook-benchmarks/pytorch/fgsm_tutorial.ipynb
 
 Verify Previous PyTorch Notebook Image With GPU
     [Documentation]    Runs a workload after spawning the N-1 PyTorch Notebook
     [Tags]    Tier2    LiveTesting
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2129
     [Setup]    N-1 PyTorch Setup
     Spawn Notebook With Arguments    image=${NOTEBOOK_IMAGE}    size=Small    gpus=1    version=previous
diff --git a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-tensorflow-test.robot b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-tensorflow-test.robot
index 714283fd7..da4f6e260 100644
--- a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-tensorflow-test.robot
+++ b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/minimal-tensorflow-test.robot
@@ -50,7 +50,7 @@ Verify Tensorboard Is Accessible
 Verify Tensorflow Image Can Be Spawned With GPU
     [Documentation]    Spawns PyTorch image with 1 GPU
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1151
     Close Previous Server
     Spawn Notebook With Arguments    image=${NOTEBOOK_IMAGE}    size=Small    gpus=1
@@ -58,28 +58,28 @@ Verify Tensorflow Image Can Be Spawned With GPU
 Verify Tensorflow Image Includes Expected CUDA Version
     [Documentation]    Checks CUDA version
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1152
     Verify Installed CUDA Version    ${EXPECTED_CUDA_VERSION}
 
 Verify Tensorflow Library Can See GPUs In Tensorflow Image
     [Documentation]    Verifies Tensorlow can see the GPU
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1153
     Verify Tensorflow Can See GPU
 
 Verify Tensorflow Image GPU Workload
     [Documentation]    Runs a workload on GPUs in Tensorflow image
     [Tags]    Tier1
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-1154
     Run Repo And Clean    https://github.com/lugi0/notebook-benchmarks    notebook-benchmarks/tensorflow/GPU-no-warnings.ipynb
 
 Verify Previous Tensorflow Notebook Image With GPU
     [Documentation]    Runs a workload after spawning the N-1 Tensorflow Notebook
     [Tags]    Tier2    LiveTesting
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2130
     [Setup]    N-1 Tensorflow Setup
     Spawn Notebook With Arguments    image=${NOTEBOOK_IMAGE}    size=Small    gpus=1    version=previous
diff --git a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/multiple-gpus.robot b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/multiple-gpus.robot
index ef3b7d9a9..fa36b82ba 100644
--- a/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/multiple-gpus.robot
+++ b/ods_ci/tests/Tests/0500__ide/0501__ide_jupyterhub/multiple-gpus.robot
@@ -22,7 +22,7 @@ Verify Number Of Available GPUs Is Correct
     [Documentation]    Verifies that the number of available GPUs in the
     ...    Spawner dropdown is correct; i.e., it should show the maximum
     ...    Number of GPUs available in a single node.
-    [Tags]    Sanity    Resources-2GPUS
+    [Tags]    Sanity    Resources-2GPUS    NVIDIA-GPUs
     ...    ODS-1256
     ${maxNo} =    Find Max Number Of GPUs In One Node
     ${maxSpawner} =    Fetch Max Number Of GPUs In Spawner Page
@@ -31,7 +31,7 @@ Verify Number Of Available GPUs Is Correct
 Verify Two Servers Can Be Spawned
     [Documentation]    Spawns two servers requesting 1 gpu each, and checks
     ...    that both can schedule and are scheduled on different nodes.
-    [Tags]    Sanity    Resources-2GPUS
+    [Tags]    Sanity    Resources-2GPUS    NVIDIA-GPUs
     ...    ODS-1257
     Spawn Notebook With Arguments    image=${NOTEBOOK_IMAGE}    size=Small    gpus=1
     ${serial_first} =    Get GPU Serial Number
diff --git a/ods_ci/tests/Tests/0600__distributed_workloads/0602__training/test-run-tuning-stack-tests.robot b/ods_ci/tests/Tests/0600__distributed_workloads/0602__training/test-run-tuning-stack-tests.robot
index 77b2714b6..ab05b15db 100644
--- a/ods_ci/tests/Tests/0600__distributed_workloads/0602__training/test-run-tuning-stack-tests.robot
+++ b/ods_ci/tests/Tests/0600__distributed_workloads/0602__training/test-run-tuning-stack-tests.robot
@@ -31,7 +31,7 @@ Run Training operator ODH test base LoRA use case
 # Run Training operator ODH test base QLoRA use case
 #    [Documentation]    Run Go ODH tests for Training operator base QLoRA use case
 #    [Tags]    RHOAIENG-13142
-#    ...    Resources-GPU
+#    ...    Resources-GPU    NVIDIA-GPUs
 #    ...    Tier1
 #    ...    DistributedWorkloads
 #    ...    Training
diff --git a/ods_ci/tests/Tests/0600__distributed_workloads/test-run-distributed-workloads-tests.robot b/ods_ci/tests/Tests/0600__distributed_workloads/test-run-distributed-workloads-tests.robot
index 1149c38f9..511b0658d 100644
--- a/ods_ci/tests/Tests/0600__distributed_workloads/test-run-distributed-workloads-tests.robot
+++ b/ods_ci/tests/Tests/0600__distributed_workloads/test-run-distributed-workloads-tests.robot
@@ -25,7 +25,7 @@ Run TestKueueRayCpu ODH test
 Run TestKueueRayGpu ODH test
     [Documentation]    Run Go ODH test: TestKueueRayGpu
-    [Tags]    Resources-GPU
+    [Tags]    Resources-GPU    NVIDIA-GPUs
     ...    Tier1
     ...    DistributedWorkloads
     ...    Training
@@ -43,7 +43,7 @@ Run TestRayTuneHPOCpu ODH test
 Run TestRayTuneHPOGpu ODH test
     [Documentation]    Run Go ODH test: TestMnistRayTuneHpoGpu
-    [Tags]    Resources-GPU
+    [Tags]    Resources-GPU    NVIDIA-GPUs
     ...    Tier1
     ...    DistributedWorkloads
     ...    Training
@@ -62,7 +62,7 @@ Run TestKueueCustomRayCpu ODH test
 Run TestKueueCustomRayGpu ODH test
     [Documentation]    Run Go ODH test: TestKueueCustomRayGpu
     [Tags]    RHOAIENG-10013
-    ...    Resources-GPU
+    ...    Resources-GPU    NVIDIA-GPUs
     ...    Tier1
     ...    DistributedWorkloads
     ...    Training
diff --git a/ods_ci/tests/Tests/1000__model_serving/1002__model_serving_modelmesh_gpu.robot b/ods_ci/tests/Tests/1000__model_serving/1002__model_serving_modelmesh_gpu.robot
index 734fa115b..35934d4c8 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1002__model_serving_modelmesh_gpu.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1002__model_serving_modelmesh_gpu.robot
@@ -25,7 +25,7 @@ ${RUNTIME_NAME}=    Model Serving GPU Test
 *** Test Cases ***
 Verify GPU Model Deployment Via UI    # robocop: off=too-long-test-case,too-many-calls-in-test-case
     [Documentation]    Test the deployment of an openvino_ir model on a model server with GPUs attached
-    [Tags]    Sanity    Resources-GPU
+    [Tags]    Sanity    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2214
     Clean All Models Of Current User
     Open Data Science Projects Home Page
@@ -57,7 +57,7 @@ Verify GPU Model Deployment Via UI    # robocop: off=too-long-test-case,too-many
 Test Inference Load On GPU
     [Documentation]    Test the inference load on the GPU after sending random requests to the endpoint
-    [Tags]    Sanity    Resources-GPU
+    [Tags]    Sanity    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2213
     ${url}=    Get Model Route Via UI    ${MODEL_NAME}
     Send Random Inference Request    endpoint=${url}    no_requests=100
diff --git a/ods_ci/tests/Tests/1000__model_serving/1005__model_serving_ovms_on_kserve.robot b/ods_ci/tests/Tests/1000__model_serving/1005__model_serving_ovms_on_kserve.robot
index 29003c822..0c3da1869 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1005__model_serving_ovms_on_kserve.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1005__model_serving_ovms_on_kserve.robot
@@ -104,7 +104,7 @@ Verify Multiple Projects With Same Model (OVMS on Kserve)
 Verify GPU Model Deployment Via UI (OVMS on Kserve)    # robocop: off=too-long-test-case,too-many-calls-in-test-case
     [Documentation]    Test the deployment of an openvino_ir model on a model server with GPUs attached
-    [Tags]    Tier1    Resources-GPU
+    [Tags]    Tier1    Resources-GPU    NVIDIA-GPUs
     ...    ODS-2630    ODS-2631    ProductBug    RHOAIENG-3355
     ${requests}=    Create Dictionary    nvidia.com/gpu=1
     ${limits}=    Create Dictionary    nvidia.com/gpu=1
diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm.robot
index f27404f72..5e3a67142 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm.robot
@@ -339,7 +339,7 @@ Verify User Can Set Requests And Limits For A Model    # robocop: off=too-long-t
 Verify Model Can Be Served And Query On A GPU Node    # robocop: off=too-long-test-case,too-many-calls-in-test-case
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model on GPU node
     ...    using Kserve and Caikit+TGIS runtime
-    [Tags]    Sanity    ODS-2381    Resources-GPU
+    [Tags]    Sanity    ODS-2381    Resources-GPU    NVIDIA-GPUs
     [Setup]    Set Project And Runtime    namespace=singlemodel-gpu
     ${test_namespace}=    Set Variable    singlemodel-gpu
     ${model_name}=    Set Variable    flan-t5-small-caikit
diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_UI.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_UI.robot
index 566d0322d..0c0a8faf3 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_UI.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_UI.robot
@@ -151,7 +151,7 @@ Verify User Can Set Requests And Limits For A Model Using The UI    # robocop: o
 Verify Model Can Be Served And Query On A GPU Node Using The UI    # robocop: off=too-long-test-case
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model on GPU node
     ...    using Kserve and Caikit+TGIS runtime
-    [Tags]    Sanity    ODS-2523    Resources-GPU
+    [Tags]    Sanity    ODS-2523    Resources-GPU    NVIDIA-GPUs
     [Setup]    Set Up Project    namespace=singlemodel-gpu
     ${test_namespace}=    Set Variable    singlemodel-gpu
     ${model_name}=    Set Variable    flan-t5-small-caikit
diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_models.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_models.robot
index d7c8e25b7..97e2ea4f6 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_models.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_models.robot
@@ -27,7 +27,7 @@ ${RUNTIME_IMAGE}=    ${EMPTY}
 Verify User Can Serve And Query A bigscience/mt0-xxl Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-3477    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3477    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=mt0-xxl-hf    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -74,7 +74,7 @@ Verify User Can Serve And Query A bigscience/mt0-xxl Model    # robocop: off=too
 Verify User Can Serve And Query A google/flan-t5-xl Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-3480    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3480    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=flan-t5-xl-hf    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}
     ${test_namespace}=    Set Variable    flant5xl-google
@@ -122,7 +122,7 @@ Verify User Can Serve And Query A google/flan-t5-xl Model    # robocop: off=too-
 Verify User Can Serve And Query A google/flan-t5-xxl Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-3481    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3481    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=flan-t5-xxl-hf    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}
     ${test_namespace}=    Set Variable    flant5xxl-google
@@ -170,7 +170,7 @@ Verify User Can Serve And Query A google/flan-t5-xxl Model    # robocop: off=too
 Verify User Can Serve And Query A elyza/elyza-japanese-llama-2-7b-instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-3479    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3479    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=elyza-japanese    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=ELYZA-japanese-Llama-2-7b-instruct-hf
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -233,7 +233,7 @@ Verify User Can Serve And Query A elyza/elyza-japanese-llama-2-7b-instruct Model
 Verify User Can Serve And Query A ibm/mpt-7b-instruct2 Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    (mpt-7b-instruct2) using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-4201    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-4201    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=mpt-7b-instruct2    use_pvc=${USE_PVC}    use_gpu=${FALSE}
     ...    kserve_mode=${KSERVE_MODE}
     ${test_namespace}=    Set Variable    mpt-7b-instruct2-ibm
@@ -281,7 +281,7 @@ Verify User Can Serve And Query A ibm/mpt-7b-instruct2 Model    # robocop: off=t
 Verify User Can Serve And Query A google/flan-ul-2 Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-3482    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3482    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=flan-ul2-hf    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=flan-ul2-hf
     ${test_namespace}=    Set Variable    flan-ul2-google
@@ -329,7 +329,7 @@ Verify User Can Serve And Query A google/flan-ul-2 Model    # robocop: off=too-l
 Verify User Can Serve And Query A codellama/codellama-34b-instruct-hf Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS runtime
-    [Tags]    RHOAIENG-4200    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-4200    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=codellama-34b-instruct-hf    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=codellama-34b-instruct-hf
     ${test_namespace}=    Set Variable    codellama-34b
@@ -369,7 +369,7 @@ Verify User Can Serve And Query A codellama/codellama-34b-instruct-hf Model    #
 Verify User Can Serve And Query A meta-llama/llama-2-13b-chat Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-3483    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3483    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=llama-2-13b-chat    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=Llama-2-13b-chat-hf
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -433,7 +433,7 @@ Verify User Can Serve And Query A google/flan-t5-xl Prompt Tuned Model    # robo
     [Documentation]    Tests for preparing, deploying and querying a prompt-tuned LLM model
     ...    using Kserve and TGIS runtime. It uses a google/flan-t5-xl prompt-tuned
     ...    to recognize customer complaints.
-    [Tags]    RHOAIENG-3494    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-3494    Tier2    Resources-GPU    NVIDIA-GPUs
     Setup Test Variables    model_name=flan-t5-xl-hf-ptuned    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=flan-t5-xl-hf
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -494,7 +494,7 @@ Verify User Can Serve And Query A google/flan-t5-xl Prompt Tuned Model    # robo
 Verify User Can Serve And Query A instructlab/merlinite-7b-lab Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-7690    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-7690    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=merlinite-7b-lab    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=merlinite-7b-lab
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -557,7 +557,7 @@ Verify User Can Serve And Query A instructlab/merlinite-7b-lab Model    # roboco
 Verify User Can Serve And Query A ibm-granite/granite-8b-code-base Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-7689    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-7689    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-8b-code    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-8b-code-base
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -620,7 +620,7 @@ Verify User Can Serve And Query A ibm-granite/granite-8b-code-base Model    # ro
 Verify User Can Serve And Query A intfloat/e5-mistral-7b-instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-7427    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-7427    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=e5-mistral-7b    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=e5-mistral-7b-instruct
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -664,7 +664,7 @@ Verify User Can Serve And Query A intfloat/e5-mistral-7b-instruct Model    # rob
 Verify User Can Serve And Query A meta-llama/llama-3-8B-Instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve and TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-8831    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-8831    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=llama-3-8b-chat    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=Meta-Llama-3-8B-Instruct
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -727,7 +727,7 @@ Verify User Can Serve And Query A meta-llama/llama-3-8B-Instruct Model    # robo
 Verify User Can Serve And Query A ibm-granite/granite-3b-code-instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-8819    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-8819    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-8b-code    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-3b-code-instruct
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -790,7 +790,7 @@ Verify User Can Serve And Query A ibm-granite/granite-3b-code-instruct Model
 Verify User Can Serve And Query A ibm-granite/granite-8b-code-instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-8830    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-8830    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-8b-code    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-8b-code-instruct
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -853,7 +853,7 @@ Verify User Can Serve And Query A ibm-granite/granite-8b-code-instruct Model
 Verify User Can Serve And Query A ibm-granite/granite-7b-lab Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-8830    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-8830    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-8b-code    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-7b-lab
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -916,7 +916,7 @@ Verify User Can Serve And Query A ibm-granite/granite-7b-lab Model    # robocop:
 Verify User Can Serve And Query A ibm-granite/granite-7b-lab ngram speculative Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-10162    VLLM
+    [Tags]    RHOAIENG-10162    VLLM    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-7b-lab    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-7b-lab
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -982,7 +982,7 @@ Verify User Can Serve And Query A ibm-granite/granite-7b-lab ngram speculative M
 Verify User Can Serve And Query A microsoft/Phi-3-vision-128k-instruct vision Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-10164    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-10164    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=phi-3-vision    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=Phi-3-vision-128k-instruct
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -1028,7 +1028,7 @@ Verify User Can Serve And Query A microsoft/Phi-3-vision-128k-instruct vision Mo
 Verify User Can Serve And Query A meta-llama/llama-31-8B-Instruct Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve for vllm runtime
-    [Tags]    RHOAIENG-10661    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-10661    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=llama-3-8b-chat    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=Meta-Llama-3.1-8B
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -1094,7 +1094,7 @@ Verify User Can Serve And Query A meta-llama/llama-31-8B-Instruct Model    # rob
 Verify User Can Serve And Query RHAL AI granite-7b-starter Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using TGIS standalone or vllm runtime
-    [Tags]    RHOAIENG-10154    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-10154    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-7b-lab    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-7b-starter
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
@@ -1157,7 +1157,7 @@ Verify User Can Serve And Query RHAL AI granite-7b-starter Model    # robocop: o
 Verify User Can Serve And Query Granite-7b Speculative Decoding Using Draft Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using vllm runtime
-    [Tags]    RHOAIENG-10163    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-10163    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-7b-lab    use_pvc=${FALSE}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=speculative_decoding
     IF    "${RUNTIME_NAME}" == "tgis-runtime"
@@ -1227,7 +1227,7 @@ Verify User Can Serve And Query Granite-7b Speculative Decoding Using Draft Mode
 Verify User Can Serve And Query RHAL AI Granite-7b-redhat-lab Model    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model
     ...    using Kserve using vllm runtime
-    [Tags]    RHOAIENG-10155    VLLM    Tier2    Resources-GPU
+    [Tags]    RHOAIENG-10155    VLLM    Tier2    Resources-GPU    NVIDIA-GPUs    AMD-GPUs
     Setup Test Variables    model_name=granite-7b-lab    use_pvc=${USE_PVC}    use_gpu=${USE_GPU}
     ...    kserve_mode=${KSERVE_MODE}    model_path=granite-7b-redhat-lab
     Set Project And Runtime    runtime=${RUNTIME_NAME}    namespace=${test_namespace}
diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_other_runtimes_UI.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_other_runtimes_UI.robot
index 01eb5632f..8011e9e49 100644
--- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_other_runtimes_UI.robot
+++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_other_runtimes_UI.robot
@@ -58,7 +58,7 @@ Verify Non Admin Can Serve And Query A Model Using The UI    # robocop: off=too-
 Verify Model Can Be Served And Query On A GPU Node Using The UI    # robocop: off=too-long-test-case,too-many-calls-in-test-case,line-too-long
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model on GPU node
     ...    using Single-model platform and TGIS Standalone runtime.
-    [Tags]    Sanity    ODS-2612    Resources-GPU
+    [Tags]    Sanity    ODS-2612    Resources-GPU    NVIDIA-GPUs
     [Setup]    Run    git clone https://github.com/IBM/text-generation-inference/
     ${test_namespace}=    Set Variable    ${TEST_NS}
     ${isvc__name}=    Set Variable    flan-t5-small-hf-gpu
@@ -84,7 +84,7 @@ Verify Model Can Be Served And Query On A GPU Node Using The UI    # robocop: of
 Verify Model Can Be Served And Query On A GPU Node Using The UI For VLMM
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model on GPU node
     ...    using Single-model platform with vllm runtime.
-    [Tags]    Sanity    RHOAIENG-6344    Resources-GPU
+    [Tags]    Sanity    RHOAIENG-6344    Resources-GPU    NVIDIA-GPUs
     ${test_namespace}=    Set Variable    ${TEST_NS}
     ${isvc__name}=    Set Variable    gpt2-gpu
     ${model_name}=    Set Variable    gpt2
@@ -106,7 +106,7 @@ Verify Model Can Be Served And Query On A GPU Node Using The UI For VLMM
 Verify Embeddings Model Can Be Served And Query On A GPU Node Using The UI For VLMM
     [Documentation]    Basic tests for preparing, deploying and querying a LLM model on GPU node
     ...    using Single-model platform with vllm runtime.
- [Tags] Sanity RHOAIENG-8832 Resources-GPU + [Tags] Sanity RHOAIENG-8832 Resources-GPU NVIDIA-GPUs ${test_namespace}= Set Variable ${TEST_NS} ${isvc__name}= Set Variable e5-mistral-7b-gpu ${model_name}= Set Variable e5-mistral-7b diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_tgis.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_tgis.robot index cbf346e15..b2bdf4c7e 100644 --- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_tgis.robot +++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1007__model_serving_llm_tgis.robot @@ -393,7 +393,7 @@ Verify User Can Set Requests And Limits For A Model # robocop: off=too-long-t Verify Model Can Be Served And Query On A GPU Node # robocop: off=too-long-test-case,too-many-calls-in-test-case [Documentation] Basic tests for preparing, deploying and querying a LLM model on GPU node ... using Kserve and Caikit+TGIS runtime - [Tags] Tier1 ODS-2381 Resources-GPU + [Tags] Tier1 ODS-2381 Resources-GPU NVIDIA-GPUs [Setup] Set Project And Runtime runtime=${TGIS_RUNTIME_NAME} namespace=singlemodel-gpu ${test_namespace}= Set Variable singlemodel-gpu ${model_name}= Set Variable flan-t5-small-caikit diff --git a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1008__model_serving_vllm/1008__model_serving_vllm_metrics.robot b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1008__model_serving_vllm/1008__model_serving_vllm_metrics.robot index 326d7668f..c1e713992 100644 --- a/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1008__model_serving_vllm/1008__model_serving_vllm_metrics.robot +++ b/ods_ci/tests/Tests/1000__model_serving/1007__model_serving_llm/1008__model_serving_vllm/1008__model_serving_vllm_metrics.robot @@ -57,7 +57,7 @@ ${TEST_NS}= vllm-gpt2 *** Test Cases *** Verify User Can Deploy A Model With Vllm Via CLI [Documentation] Deploy a 
model (gpt2) using the vllm runtime and confirm that it's running - [Tags] Sanity Resources-GPU RHOAIENG-6264 VLLM + [Tags] Sanity Resources-GPU NVIDIA-GPUs RHOAIENG-6264 VLLM ${rc} ${out}= Run And Return Rc And Output oc apply -f ${DL_POD_FILEPATH} Should Be Equal As Integers ${rc} ${0} Wait For Pods To Succeed label_selector=gpt-download-pod=true namespace=${TEST_NS} @@ -77,7 +77,7 @@ Verify User Can Deploy A Model With Vllm Via CLI Verify Vllm Metrics Are Present [Documentation] Confirm vLLM metrics are exposed in OpenShift metrics - [Tags] Sanity Resources-GPU RHOAIENG-6264 VLLM + [Tags] Sanity Resources-GPU NVIDIA-GPUs RHOAIENG-6264 VLLM Depends On Test Verify User Can Deploy A Model With Vllm Via CLI ${host}= llm.Get KServe Inference Host Via CLI isvc_name=vllm-gpt2-openai namespace=${TEST_NS} ${rc} ${out}= Run And Return Rc And Output curl -ks https://${host}/metrics/ @@ -91,7 +91,7 @@ Verify Vllm Metrics Are Present Verify Vllm Metrics Values Match Between UWM And Endpoint [Documentation] Confirm the values returned by UWM and by the model endpoint match for each metric - [Tags] Sanity Resources-GPU RHOAIENG-6264 RHOAIENG-7687 VLLM + [Tags] Sanity Resources-GPU NVIDIA-GPUs RHOAIENG-6264 RHOAIENG-7687 VLLM Depends On Test Verify User Can Deploy A Model With Vllm Via CLI Depends On Test Verify Vllm Metrics Are Present ${host}= llm.Get KServe Inference Host Via CLI isvc_name=vllm-gpt2-openai namespace=${TEST_NS}