ENG-7635: Updates to clarify Red Hat support #329

Merged: 4 commits, Jun 21, 2024
@@ -38,13 +38,39 @@ For a complete list of the default notebook images and their preinstalled packages
endif::[]

* You have sufficient resources. In addition to the base {productname-short} resources, you need 1.6 vCPU and 2 GiB memory to deploy the distributed workloads infrastructure.

ifndef::upstream[]
* The resources are physically available in the cluster.
+
[NOTE]
====
In {productname-short} {vernum}, {org-name} supports only a single cluster queue per cluster (that is, homogeneous clusters), and only empty resource flavors.
For more information about Kueue resources, see link:{rhoaidocshome}{default-format-url}/working_with_distributed_workloads/overview-of-distributed-workloads_distributed-workloads#overview-of-kueue-resources_distributed-workloads[Overview of Kueue resources].
====
endif::[]
ifdef::upstream[]
* The resources are physically available in the cluster.
+
[NOTE]
====
{productname-short} currently supports only a single cluster queue per cluster (that is, homogeneous clusters), and only empty resource flavors.
For more information about Kueue resources, see the link:https://kueue.sigs.k8s.io/docs/concepts/[Kueue documentation].
For more information about Kueue resources, see link:{odhdocshome}/working_with_distributed_workloads/#_overview-of-kueue-resources_distributed-workloads[Overview of Kueue resources].
====
endif::[]

ifndef::upstream[]
* If you want to use graphics processing units (GPUs), you have enabled GPU support in {productname-short}.
See link:{rhoaidocshome}{default-format-url}/managing_resources/managing-cluster-resources_cluster-mgmt#enabling-gpu-support_cluster-mgmt[Enabling GPU support in {productname-short}].
+
[NOTE]
====
In {productname-short} {vernum}, {org-name} supports only NVIDIA GPU accelerators for distributed workloads.
====
endif::[]
ifdef::upstream[]
* If you want to use graphics processing units (GPUs), you have enabled GPU support.
This process includes installing the Node Feature Discovery Operator and the NVIDIA GPU Operator.
For more information, see https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html[NVIDIA GPU Operator on {org-name} OpenShift Container Platform^] in the NVIDIA documentation.
endif::[]
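
The notes above state that only empty resource flavors are supported. As a hedged sketch (the flavor name `default-flavor` is an assumption, not a value from this document), an empty resource flavor defines no node labels, taints, or tolerations in its spec:

[source,yaml]
----
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor  # assumed name, not from this document
# No nodeLabels, nodeTaints, or tolerations: an "empty" flavor
----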


.Procedure
@@ -106,11 +132,7 @@ You must specify a quota for each resource that the user can request, even if the

* Include the resource name in the `coveredResources` list.
* Specify the resource `name` and `nominalQuota` in the `flavors.resources` section, even if the `nominalQuota` value is 0.
+
[NOTE]
====
In this release of {productname-short}, the only accelerators supported for distributed workloads are NVIDIA GPUs.
====
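+
The quota rules above can be sketched as a hypothetical `ClusterQueue` fragment (the flavor name `default-flavor` and all quota values are illustrative assumptions, not values from this document):
+
[source,yaml]
----
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}  # assumed: match all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]  # each requestable resource is listed
    flavors:
    - name: default-flavor  # assumed flavor name
      resources:
      - name: "cpu"
        nominalQuota: 9     # illustrative value
      - name: "memory"
        nominalQuota: 36Gi  # illustrative value
      - name: "nvidia.com/gpu"
        nominalQuota: 0     # the quota must be specified even when its value is 0
----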

.. Apply the configuration to create the `cluster-queue` object:
+
[source,bash]
@@ -32,7 +32,7 @@ endif::[]
[NOTE]
====
Mutual Transport Layer Security (mTLS) is enabled by default in the CodeFlare component in {productname-short}.
In the current {productname-short} version, `submissionMode=K8sJobMode` is not supported in the Ray job specification, so the KubeRay Operator cannot create a submitter Kubernetes Job to submit the Ray job.
{productname-short} {vernum} does not support the `submissionMode=K8sJobMode` setting in the Ray job specification, so the KubeRay Operator cannot create a submitter Kubernetes Job to submit the Ray job.
Instead, users must configure the Ray job specification to set `submissionMode=HTTPMode` only, so that the KubeRay Operator sends a request to the RayCluster to create a Ray job.
====
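
The `submissionMode` requirement in the preceding note can be sketched as a partial, hypothetical `RayJob` manifest (the metadata name and entrypoint are illustrative assumptions):

[source,yaml]
----
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: example-ray-job  # assumed name
spec:
  submissionMode: HTTPMode  # K8sJobMode is not supported here
  entrypoint: python sample_code.py  # illustrative entrypoint
  # rayClusterSpec omitted for brevity
----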
* You have access to the data sets and models that the distributed workload uses.
@@ -48,6 +48,11 @@ endif::[]
ifndef::upstream[]
* If you want to use graphics processing units (GPUs), you have enabled GPU support in {productname-short}.
See link:{rhoaidocshome}{default-format-url}/managing_resources/managing-cluster-resources_cluster-mgmt#enabling-gpu-support_cluster-mgmt[Enabling GPU support in {productname-short}].
+
[NOTE]
====
In {productname-short} {vernum}, {org-name} supports only NVIDIA GPU accelerators for distributed workloads.
====
endif::[]
ifdef::upstream[]
* If you want to use graphics processing units (GPUs), you have enabled GPU support.
6 changes: 5 additions & 1 deletion modules/overview-of-kueue-resources.adoc
@@ -50,10 +50,14 @@ spec:

----


ifndef::upstream[]

[NOTE]
====
{productname-short} currently supports only a single cluster queue per cluster (that is, homogeneous clusters), and only empty resource flavors.
In {productname-short} {vernum}, {org-name} supports only a single cluster queue per cluster (that is, homogeneous clusters), and only empty resource flavors.
====
endif::[]


For more information about configuring resource flavors, see link:https://kueue.sigs.k8s.io/docs/concepts/resource_flavor/[Resource Flavor] in the Kueue documentation.
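
As a hedged illustration of how these resources fit together, a `LocalQueue` in a user namespace can point workloads at the cluster queue (the namespace and queue names below are assumptions, not values from this document):

[source,yaml]
----
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: example-project   # assumed user namespace
  name: local-queue            # assumed name
spec:
  clusterQueue: cluster-queue  # references the cluster-wide queue
----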
@@ -19,11 +19,9 @@ To run a distributed data science workload in a disconnected environment, you must

ifndef::upstream[]
* You have created a data science project that contains a workbench, and the workbench is running a default notebook image that contains the CodeFlare SDK, for example, the *Standard Data Science* notebook. For information about how to create a project, see link:{rhoaidocshome}/working_on_data_science_projects/working-on-data-science-projects_nb-server#creating-a-data-science-project_nb-server[Creating a data science project].
For a complete list of the default notebook images and their preinstalled packages, see the table in link:{rhoaidocshome}/working_on_data_science_projects/creating-and-importing-notebooks_notebooks#notebook-images-for-data-scientists_notebooks[Notebook images for data scientists].
endif::[]
ifdef::upstream[]
* You have created a data science project that contains a workbench, and the workbench is running a default notebook image that contains the CodeFlare SDK, for example, the *Standard Data Science* notebook. For information about how to create a project, see link:{odhdocshome}/working-on-data-science-projects/#_using_data_science_projects[Creating a data science project].
For a complete list of the default notebook images and their preinstalled packages, see the table in link:{odhdocshome}/working-on-data-science-projects/#_using_data_science_projects[Notebook images for data scientists].
endif::[]

* You have Admin access for the data science project.