RHOAIENG-4198: Initial commit for kserve perf metrics #396

Merged 2 commits into opendatahub-io:main on Aug 9, 2024

Conversation

@syaseen-rh (Contributor) commented Aug 7, 2024

Description

Initial commit for the KServe performance metrics feature.

  • Adds a procedure for viewing KServe metrics on the single-model serving platform.

How Has This Been Tested?

Local build

Previews:

Monitoring Model Performance (Multi-model serving platform)


Monitoring Model Performance (Single-model serving platform)


@VedantMahabaleshwarkar left a comment:


Suggested some changes, but it's hard for me to fully understand them based on the code alone. Can I see the generated output?


You can monitor the following metrics for a specific model that is deployed on the single-model serving platform:

* *Number of requests* - The number of HTTP requests that have failed or succeeded for a specific model.


Suggested change
* *Number of requests* - The number of HTTP requests that have failed or succeeded for a specific model.
* *Number of requests* - The number of requests that have failed or succeeded for a specific model.


* *Number of requests* - The number of HTTP requests that have failed or succeeded for a specific model.
* *Average response time (ms)* - The average time it takes a specific model to respond to requests.
* *CPU utilization (%)* - The percentage of the CPU's capacity that is currently being used by a specific model.


Suggested change
* *CPU utilization (%)* - The percentage of the CPU's capacity that is currently being used by a specific model.
* *CPU utilization (%)* - The percentage of the model deployment's CPU's limit that is currently being used by a specific model.

* *Number of requests* - The number of HTTP requests that have failed or succeeded for a specific model.
* *Average response time (ms)* - The average time it takes a specific model to respond to requests.
* *CPU utilization (%)* - The percentage of the CPU's capacity that is currently being used by a specific model.
* *Memory utilization (%)* - The percentage of the system's memory that is currently being used by a specific model.


Suggested change
* *Memory utilization (%)* - The percentage of the system's memory that is currently being used by a specific model.
* *Memory utilization (%)* - The percentage of the model deployment's memory limit that is currently being used by a specific model.

syaseen-rh (Contributor, author) replied:


@VedantMahabaleshwarkar What does model deployment mean here?

@VedantMahabaleshwarkar replied:

Pod if there's only one replica, pods if there are multiple replicas. But what I specifically mean is the OpenShift Deployment that represents the model. The Deployment has resource requests and limits, and the percentage we show is relative to the limit that is set in the Deployment.
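For illustration, this is roughly what the resource limits on a model's Deployment look like (a minimal sketch with placeholder names and sizes, not taken from this PR); the CPU and memory utilization percentages are computed against the `limits` values:

```yaml
# Illustrative sketch only: the Deployment that backs a deployed model.
# CPU utilization (%) and Memory utilization (%) are reported relative
# to the limits set here.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-model-predictor            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-model
  template:
    metadata:
      labels:
        app: example-model
    spec:
      containers:
      - name: kserve-container
        image: example.com/serving-runtime:latest   # placeholder image
        resources:
          requests:
            cpu: "1"
            memory: 4Gi
          limits:
            cpu: "2"        # CPU utilization (%) is measured against this limit
            memory: 8Gi     # Memory utilization (%) is measured against this limit
```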

syaseen-rh (Contributor, author) replied:


@VedantMahabaleshwarkar I think that model deployment might sound confusing - proposing an alternative, let me know your thoughts:

  • CPU utilization (%) - The percentage of the CPU limit per model replica that is currently utilized by a specific model.
  • Memory utilization (%) - The percentage of the memory limit per model replica that is currently utilized by a specific model.

I based my assumption on this setting:

[Screenshot of the setting referenced above]
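For reference, a minimal sketch of where those per-replica values are set on the KServe side (an InferenceService with illustrative names and sizes, not taken from this PR). Each model replica gets these limits, which is why the "per model replica" wording seems accurate:

```yaml
# Illustrative sketch only: per-replica resources on a KServe InferenceService.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model                                # placeholder name
spec:
  predictor:
    minReplicas: 1
    maxReplicas: 2
    model:
      modelFormat:
        name: onnx                                   # placeholder model format
      storageUri: s3://example-bucket/models/example # placeholder URI
      resources:
        limits:
          cpu: "2"        # limit applied to each model replica
          memory: 8Gi
```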

@VedantMahabaleshwarkar

Previews look good to me, just some minor suggestions as made previously

@@ -10,7 +10,19 @@ You can view a graph that illustrates the HTTP requests that have failed or succ
.Prerequisites
* You have installed {productname-long}.
* On the OpenShift cluster where {productname-short} is installed, user workload monitoring is enabled.
* Your cluster administrator has _not_ edited the {productname-short} dashboard configuration to hide the *Endpoint Performance* tab on the *Model Serving* page. For more information, see link:{rhoaidocshome}/html/managing_resources/customizing-the-dashboard#ref-dashboard-configuration-options_dashboard[Dashboard configuration options].
* The following dashboard configuration options are set to their default values as shown:

the default values?
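As background for the "user workload monitoring is enabled" prerequisite quoted in this hunk: on OpenShift that is normally enabled through the standard cluster monitoring ConfigMap, roughly as sketched below (standard OpenShift configuration, not part of this PR):

```yaml
# Enables monitoring for user-defined projects, which the metrics graphs rely on.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```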

@@ -0,0 +1,72 @@
:_module-type: PROCEDURE

[id="viewing-performance-metrics-for-model-server_{context}"]

Should the title and id be the same?

ifdef::upstream[]
* If you are using specialized {productname-short} groups, you are part of the user group or admin group (for example, {odh-user-group} or {odh-admin-group}) in OpenShift.
endif::[]
* The following dashboard configuration options are set to their default values as shown:

the default values
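To make the "default values" wording concrete: the options in question would live in the OdhDashboardConfig resource. A rough sketch is below; the exact field names (disableKServeMetrics, disablePerformanceMetrics) are my assumption and should be verified against the dashboard configuration options reference linked in the prerequisites:

```yaml
# Rough sketch only: field names are assumptions, check them against the
# documented OdhDashboardConfig options.
apiVersion: opendatahub.io/v1alpha
kind: OdhDashboardConfig
metadata:
  name: odh-dashboard-config
spec:
  dashboardConfig:
    disableKServeMetrics: false        # assumed default: KServe metrics shown
    disablePerformanceMetrics: false   # assumed default: Endpoint Performance tab shown
```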

@aduquett (Contributor) commented Aug 8, 2024:

A few minor comments, but otherwise LGTM.

@syaseen-rh merged commit 3df813f into opendatahub-io:main on Aug 9, 2024.