Commit
incorporate peer review feedback (#49)
* incorporate peer review feedback

* final peer review comments from downstream
MelissaFlinn authored Oct 1, 2024
1 parent f307763 commit 7b42799
Showing 20 changed files with 35 additions and 34 deletions.
(4 changed files could not be displayed in this view.)
@@ -65,13 +65,13 @@ image::pipelines/wb-pipeline-node-1.png[Select Node 1, 150]

. Scroll down to the *File Dependencies* section and then click *Add*.
+
image::pipelines/wb-pipeline-node-1-file-dep.png[Add File Dependency, 400]
image::pipelines/wb-pipeline-node-1-file-dep.png[Add File Dependency, 500]

. Set the value to `data/*.csv` which contains the data to train your model.

. Select the *Include Subdirectories* option and then click *Add*.
+
image::pipelines/wb-pipeline-node-1-file-dep-form.png[Set File Dependency Value, 300]
image::pipelines/wb-pipeline-node-1-file-dep-form.png[Set File Dependency Value, 500]

. Save the pipeline.

@@ -168,7 +168,7 @@ image::pipelines/wb-pipeline-kube-secret-form.png[Secret Form, 300]

== Run the Pipeline

Upload the pipeline on your cluster and run it. You can do so directly from the pipeline editor. You can use your own newly created pipeline for this or `6 Train Save.pipeline`.
Upload the pipeline on your cluster and run it. You can do so directly from the pipeline editor. You can use your own newly created pipeline or the pipeline provided in the `6 Train Save.pipeline` file.

.Procedure

3 changes: 2 additions & 1 deletion workshop/docs/modules/ROOT/pages/conclusion.adoc
@@ -5,10 +5,11 @@
[.text-center.strong]
== Conclusion

Congratulations. In this {deliverable}, you learned how to incorporate data science, artificial intelligence, and machine learning into an OpenShift development workflow.
Congratulations. In this {deliverable}, you learned how to incorporate data science, artificial intelligence, and machine learning into an OpenShift development workflow.

You used an example fraud detection model and completed the following tasks:

* Explored a pre-trained fraud detection model by using a Jupyter notebook.
* Deployed the model by using {productname-short} model serving.
* Refined and trained the model by using automated pipelines.
* Learned how to train the model by using Ray, a distributed computing framework.
2 changes: 1 addition & 1 deletion workshop/docs/modules/ROOT/pages/creating-a-workbench.adoc
@@ -16,7 +16,7 @@ A workbench is an instance of your development and experimentation environment.

. Click the *Workbenches* tab, and then click the *Create workbench* button.
+
image::workbenches/ds-project-create-workbench.png[Create workbench button, 300]
image::workbenches/ds-project-create-workbench.png[Create workbench button, 600]

. Fill out the name and description.
+
@@ -1,11 +1,11 @@
[id='creating-data-connections-to-storage']
= Creating data connections to your own S3-compatible object storage

If you have existing S3-compatible storage buckets that you want to use for this {deliverable}, you must create a data connection to one storage bucket for saving your data and models and, if you want to complete the pipelines section of this {deliverable}, create another data connection to a different storage bucket for saving pipeline artifacts.
If you have existing S3-compatible storage buckets that you want to use for this {deliverable}, you must create a data connection to one storage bucket for saving your data and models. If you want to complete the pipelines section of this {deliverable}, create another data connection to a different storage bucket for saving pipeline artifacts.

NOTE: If you do not have your own s3-compatible storage, or if you want to use a disposable local Minio instance instead, skip this section and follow the steps in xref:running-a-script-to-install-storage.adoc[Running a script to install local object storage buckets and create data connections]. The provided script automatically completes the following tasks for you: creates a Minio instance in your project, creates two storage buckets in that Minio instance, creates two data connections in your project, one for each bucket and both using the same credentials, and installs required network policies for service mesh functionality.
NOTE: If you do not have your own s3-compatible storage, or if you want to use a disposable local Minio instance instead, skip this section and follow the steps in xref:running-a-script-to-install-storage.adoc[Running a script to install local object storage buckets and create data connections]. The provided script automatically completes the following tasks for you: creates a Minio instance in your project, creates two storage buckets in that Minio instance, creates two data connections in your project, one for each bucket and both using the same credentials, and installs required network policies for service mesh functionality.

.Prerequisite
.Prerequisites

To create data connections to your existing S3-compatible storage buckets, you need the following credential information for the storage buckets:

@@ -27,11 +27,11 @@ If you don't have this information, contact your storage administrator.
+
image::projects/ds-project-add-dc.png[Add data connection]

.. Fill out the *Add data connection* form and name your connection *My Storage*. This connection is for saving your personal work, including data and models.
.. Complete the *Add data connection* form and name your connection *My Storage*. This connection is for saving your personal work, including data and models.
+
NOTE: Skip the *Connected workbench* item. You add data connections to a workbench in a later section.
+
image::projects/ds-project-my-storage-form.png[Add my storage form, 400]
image::projects/ds-project-my-storage-form.png[Add my storage form, 500]

.. Click *Add data connection*.

@@ -41,11 +41,11 @@ NOTE: If you do not intend to complete the pipelines section of the {deliverable

.. Click *Add data connection*.

.. Fill out the form and name your connection *Pipeline Artifacts*.
.. Complete the form and name your connection *Pipeline Artifacts*.
+
NOTE: Skip the *Connected workbench* item. You add data connections to a workbench in a later section.
+
image::projects/ds-project-pipeline-artifacts-form.png[Add pipeline artifacts form, 400]
image::projects/ds-project-pipeline-artifacts-form.png[Add pipeline artifacts form, 500]

.. Click *Add data connection*.

@@ -54,7 +54,7 @@ image::projects/ds-project-pipeline-artifacts-form.png[Add pipeline artifacts fo

In the *Data connections* tab for the project, check to see that your data connections are listed.

image::projects/ds-project-dc-list.png[List of project data connections, 400]
image::projects/ds-project-dc-list.png[List of project data connections, 500]


.Next steps
@@ -3,7 +3,7 @@

{productname-short} multi-model servers can host several models at once. You create a new model server and deploy your model to it.

.Prerequisite
.Prerequisites

* A user with `admin` privileges has enabled the multi-model serving platform on your OpenShift cluster.

@@ -43,7 +43,7 @@ image::model-serving/deploy-model-form-mm.png[Deploy model from for multi-model

.Verification

Notice the loading symbol under the *Status* section. It will change to a green checkmark when the deployment is completes successfully.
Notice the loading symbol under the *Status* section. The symbol changes to a green checkmark when the deployment completes successfully.

image::model-serving/ds-project-model-list-status-mm.png[Deployed model status]

@@ -3,10 +3,7 @@

{productname-short} single-model servers host only one model. You create a new model server and deploy your model to it.

NOTE: Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.


.Prerequisite
.Prerequisites

* A user with `admin` privileges has enabled the single-model serving platform on your OpenShift cluster.

@@ -27,13 +24,13 @@ NOTE: Depending on how model serving has been configured on your cluster, you mi
.. Type the path that leads to the version folder that contains your model file: `models/fraud`
.. Leave the other fields with the default settings.
+
image::model-serving/deploy-model-form-sm.png[Deploy model from for single-model serving, 400]
image::model-serving/deploy-model-form-sm.png[Deploy model form for single-model serving, 500]

. Click *Deploy*.

.Verification

Notice the loading symbol under the *Status* section. It will change to a green checkmark when the deployment is completes successfully.
Notice the loading symbol under the *Status* section. The symbol changes to a green checkmark when the deployment completes successfully.

image::model-serving/ds-project-model-list-status-sm.png[Deployed model status]

@@ -7,7 +7,7 @@ This section demonstrates how you can use Ray to distribute the training of a ma

In your notebook environment, open the `8_distributed_training.ipynb` file and follow the instructions directly in the notebook. The instructions guide you through setting authentication, creating Ray clusters, and working with jobs.

Optionally, if you want to view the python code for this section, you can find it in the `ray-scripts/train_tf_cpu.py` directory.
Optionally, if you want to view the Python code for this section, you can find it in the `ray-scripts/train_tf_cpu.py` file.
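
NOTE: For orientation only, the following is a minimal, generic Ray sketch that shows the core idea of distributing work across a cluster. It is not the code in `ray-scripts/train_tf_cpu.py`, and the function and address shown are illustrative stand-ins.

[source,python]
----
import ray

# Start or connect to a Ray runtime. With no arguments this runs locally;
# to use a remote cluster you would pass its address, for example
# ray.init(address="ray://<head-node>:10001") (illustrative placeholder).
ray.init()

@ray.remote
def train_shard(shard_id: int) -> float:
    # Stand-in for per-worker training work; returns a dummy metric.
    return 0.01 * shard_id

# Launch four tasks in parallel and collect their results.
futures = [train_shard.remote(i) for i in range(4)]
print(ray.get(futures))
----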

image::distributed/jupyter-notebook.png[Jupyter Notebook]

@@ -29,7 +29,7 @@ image::projects/ds-project-create-pipeline-server-form.png[Selecting the Pipelin

. Click *Configure pipeline server*.

. Wait until the spinner disappears and *Start by importing a pipeline* is displayed.
. Wait until the loading spinner disappears and *Start by importing a pipeline* is displayed.
+
[IMPORTANT]
====
@@ -5,7 +5,7 @@

The Jupyter environment is a web-based environment, but everything you do inside it happens on *{productname-long}* and is powered by the *OpenShift* cluster. This means that, without having to install and maintain anything on your own computer, and without disposing of valuable local resources such as CPU, GPU and RAM, you can conduct your Data Science work in this powerful and stable managed environment.

.Prerequisite
.Prerequisites

You created a workbench, as described in xref:creating-a-workbench.adoc[Creating a workbench and selecting a Notebook image].

3 changes: 2 additions & 1 deletion workshop/docs/modules/ROOT/pages/index.adoc
@@ -13,8 +13,9 @@ You will use an example fraud detection model to complete the following tasks:
* Explore a pre-trained fraud detection model by using a Jupyter notebook.
* Deploy the model by using {productname-short} model serving.
* Refine and train the model by using automated pipelines.
* Learn how to train the model by using Ray, a distributed computing framework.
And you do not have to install anything on your own computer, thanks to https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai[{productname-long}].
You do not have to install anything on your own computer, thanks to https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai[{productname-long}].

== About the example fraud detection model

@@ -25,7 +25,7 @@ The {productname-short} dashboard shows the *Home* page.

NOTE: You can navigate back to the OpenShift console by clicking the application launcher to access the OpenShift console.

image::projects/ds-console-ocp-tile.png[OCP console link, 250]
image::projects/ds-console-ocp-tile.png[OCP console link, 400]

For now, stay in the {productname-short} dashboard.

@@ -5,7 +5,7 @@ After you train a model, you can deploy it by using the {productname-short} mode

To prepare a model for deployment, you must complete the following tasks:

* Move the model from your workbench to your S3-compatible object storage. You use the data connection that you created in the xref:storing-data-with-data-connections.adoc[Storing data with data connections] section and upload the model from a notebook.
* Move the model from your workbench to your S3-compatible object storage. Use the data connection that you created in the xref:storing-data-with-data-connections.adoc[Storing data with data connections] section and upload the model from a notebook.

* Convert the model to the portable ONNX format. ONNX allows you to transfer models between frameworks with minimal preparation and without the need for rewriting the models.
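
NOTE: As a rough illustration of the conversion step above, the following sketch exports a Keras model to ONNX with the `tf2onnx` package. The tiny model, its 5-feature input shape, and the output file name are assumptions for illustration; the notebook in this {deliverable} contains the actual conversion code.

[source,python]
----
import tensorflow as tf
import tf2onnx

# A tiny stand-in model; in this {deliverable} the trained fraud model is used instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(5,), name="input"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The input signature must match the model's feature vector
# (a 5-feature float input is assumed here).
spec = [tf.TensorSpec(shape=(None, 5), dtype=tf.float32, name="input")]

# Writes an ONNX file to the current directory (illustrative file name).
tf2onnx.convert.from_keras(model, input_signature=spec, output_path="model.onnx")
----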

@@ -3,7 +3,7 @@

In the previous section, you created a simple pipeline by using the GUI pipeline editor. It's often desirable to create pipelines by using code that can be version-controlled and shared with others. The https://github.com/kubeflow/pipelines[Kubeflow pipelines (kfp)] SDK provides a Python API for creating pipelines. The SDK is available as a Python package that you can install by using the `pip install kfp` command. With this package, you can use Python code to create a pipeline and then compile it to YAML format. Then you can import the YAML code into {productname-short}.

This {deliverable} does not delve into the details of how to use the SDK. Instead, it provides the files for you to view and upload.
This {deliverable} does not describe the details of how to use the SDK. Instead, it provides the files for you to view and upload.
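
NOTE: For orientation only, here is a minimal sketch of what a kfp v2 pipeline definition can look like. It is not the pipeline provided with this {deliverable}; the component and pipeline names are illustrative. After compiling, you import the resulting YAML file from the {productname-short} dashboard.

[source,python]
----
from kfp import dsl, compiler

@dsl.component(base_image="python:3.11")
def train_model() -> str:
    # Stand-in for the real training step.
    return "model-artifact"

@dsl.component(base_image="python:3.11")
def save_model(artifact: str):
    # Stand-in for the step that stores the trained model.
    print(f"Saving {artifact} to object storage")

@dsl.pipeline(name="train-save-sketch")
def train_save_pipeline():
    trained = train_model()
    save_model(artifact=trained.output)

# Compile the pipeline to YAML, which you can then import into the dashboard.
compiler.Compiler().compile(train_save_pipeline, "train_save_sketch.yaml")
----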

. Optionally, view the provided Python code in your Jupyter environment by navigating to the `fraud-detection-notebooks` project's `pipeline` directory. It contains the following files:
+
@@ -15,7 +15,7 @@ IMPORTANT: The Minio-based Object Storage that the script creates is *not* meant

NOTE: If you want to connect to your own storage, see xref:creating-data-connections-to-storage.adoc[Creating data connections to your own S3-compatible object storage].

.Prerequisite
.Prerequisites

You must know the OpenShift resource name for your data science project so that you run the provided script in the correct project. To get the project's resource name:

Expand All @@ -37,7 +37,7 @@ oc apply -n <your-project-name/> -f https://github.com/rh-aiservices-bu/fraud-de

. In the {productname-short} dashboard, click the application launcher icon and then select the *OpenShift Console* option.
+
image::projects/ds-project-ocp-link.png[OpenShift Console Link]
image::projects/ds-project-ocp-link.png[OpenShift Console Link, 600]

. In the OpenShift console, click *+* in the top navigation bar.
+
@@ -19,7 +19,7 @@ You can run a code cell from the notebook interface or from the keyboard:

* *From the user interface:* Select the cell (by clicking inside the cell or to the left side of the cell) and then click *Run* from the toolbar.
+
image::workbenches/run_button.png[Jupyter Run, 75]
image::workbenches/run_button.png[Jupyter Run, 100]

* *From the keyboard:* Press `CTRL` + `ENTER` to run a cell or press `SHIFT` + `ENTER` to run the cell and automatically select the next one.

@@ -35,7 +35,7 @@ Notebooks are so named because they are like a physical _notebook_: you can take

Now that you know the basics, give it a try.

.Prerequisite
.Prerequisites

* You have imported the {deliverable} files into your Jupyter environment as described in
xref:importing-files-into-jupyter.adoc[Importing the {deliverable} files into the Jupyter environment].
@@ -1,7 +1,7 @@
[id='storing-data-with-data-connections']
= Storing data with data connections

Add data connections to workbenches to connect your project to data inputs and object storage buckets. A data connection is a resource that contains the configuration parameters needed to connect to an object storage bucket.
Add data connections to workbenches if you want to connect your project to data inputs and object storage buckets. A data connection is a resource that contains the configuration parameters needed to connect to an object storage bucket.
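
NOTE: In a connected workbench, the parameters stored in a data connection typically surface as environment variables that your code can read. The following is a minimal sketch of that idea; the variable names are an assumption based on a common setup, so check your own workbench if they differ.

[source,python]
----
import os
import boto3

# Build an S3 client from the values the data connection provides.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get("AWS_S3_ENDPOINT"),
    aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY"),
)

# List the objects in the connected bucket (empty list if the bucket is empty).
bucket = os.environ.get("AWS_S3_BUCKET")
objects = s3.list_objects_v2(Bucket=bucket).get("Contents", [])
print([obj["Key"] for obj in objects])
----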

For this {deliverable}, you need two S3-compatible object storage buckets, such as Ceph, Minio, or AWS S3. You can use your own storage buckets or run a provided script that creates the following local Minio storage buckets for you:

@@ -12,4 +12,6 @@ Also, you must create a data connection to each storage bucket. You have two opt

* If you want to use your own S3-compatible object storage buckets, create data connections to them as described in xref:creating-data-connections-to-storage.adoc[Creating data connections to your own S3-compatible object storage].

* If you want to run a script that installs local Minio storage buckets and creates data connections to them, follow the steps in xref:running-a-script-to-install-storage.adoc[Running a script to install local object storage buckets and create data connections].
* If you want to run a script that installs local Minio storage buckets and creates data connections to them, follow the steps in xref:running-a-script-to-install-storage.adoc[Running a script to install local object storage buckets and create data connections].

NOTE: While it is possible for you to use one storage bucket for both purposes (storing models and data as well as storing pipeline artifacts), this tutorial follows best practice and uses separate storage buckets for each purpose.
