[docs] Restructure how-to section to make it more readable #3147

Merged · 25 commits · Oct 28, 2024

161 changes: 158 additions & 3 deletions .gitbook.yaml

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/book/component-guide/README.md
@@ -7,7 +7,7 @@ description: Overview of categories of MLOps components and third-party integrat

If you are new to the world of MLOps, it is often daunting to be immediately faced with a sea of tools that seemingly all promise and do the same things. It is useful in this case to try to categorize tools in various groups in order to understand their value in your toolchain in a more precise manner.

- ZenML tackles this problem by introducing the concept of [Stacks and Stack Components](../user-guide/production-guide/understand-stacks.md). These stack components represent categories, each of which has a particular function in your MLOps pipeline. ZenML realizes these stack components as base abstractions that standardize the entire workflow for your team. In order to then realize the benefit, one can write a concrete implementation of the [abstraction](../how-to/stack-deployment/implement-a-custom-stack-component.md), or use one of the many built-in [integrations](README.md) that implement these abstractions for you.
+ ZenML tackles this problem by introducing the concept of [Stacks and Stack Components](../user-guide/production-guide/understand-stacks.md). These stack components represent categories, each of which has a particular function in your MLOps pipeline. ZenML realizes these stack components as base abstractions that standardize the entire workflow for your team. In order to then realize the benefit, one can write a concrete implementation of the [abstraction](../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md), or use one of the many built-in [integrations](README.md) that implement these abstractions for you.

Here is a full list of all stack components currently supported in ZenML, with a description of the role of that component in the MLOps process:

@@ -30,7 +30,7 @@ Each pipeline run that you execute with ZenML will require a **stack** and each

## Writing custom component flavors

- You can take control of how ZenML behaves by creating your own components. This is done by writing custom component `flavors`. To learn more, head over to [the general guide on writing component flavors](../how-to/stack-deployment/implement-a-custom-stack-component.md), or read more specialized guides for specific component types (e.g. the [custom orchestrator guide](orchestrators/custom.md)).
+ You can take control of how ZenML behaves by creating your own components. This is done by writing custom component `flavors`. To learn more, head over to [the general guide on writing component flavors](../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md), or read more specialized guides for specific component types (e.g. the [custom orchestrator guide](orchestrators/custom.md)).

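To make the stack concept above concrete: every pipeline run executes on whatever stack is currently active, so the same code can target a local setup or a remote cloud stack without changes. The following is only an illustrative sketch (not part of this diff), assuming a recent ZenML release that exposes the `@pipeline` and `@step` decorators from the top-level `zenml` package.

```python
# Illustrative sketch: the same pipeline code runs on whichever stack is active.
# Assumes a recent ZenML release with the top-level @pipeline/@step decorators.
from zenml import pipeline, step


@step
def load_number() -> int:
    """Toy step whose return value is saved to the active artifact store."""
    return 42


@step
def double(value: int) -> int:
    """Toy step that consumes the upstream artifact."""
    return value * 2


@pipeline
def toy_pipeline():
    double(load_number())


if __name__ == "__main__":
    # The run is orchestrated and its artifacts stored by the active stack's components.
    toy_pipeline()
```
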
## Integrations

4 changes: 2 additions & 2 deletions docs/book/component-guide/alerters/custom.md
@@ -5,7 +5,7 @@ description: Learning how to develop a custom alerter.
# Develop a Custom Alerter

{% hint style="info" %}
- Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
+ Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
{% endhint %}

### Base Abstraction
@@ -119,7 +119,7 @@ zenml alerter flavor register flavors.my_flavor.MyAlerterFlavor
```

{% hint style="warning" %}
- ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository.
+ ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md) of initializing zenml at the root of your repository.

If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root.
{% endhint %}
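
For orientation, the `flavors/my_flavor.py` module referenced by the register command above might look roughly like the sketch below. This is illustrative only and not part of this diff; the import path `zenml.alerter.base_alerter` and the `post`/`ask` signatures are assumptions patterned on recent ZenML releases and should be checked against your installed version.

```python
# flavors/my_flavor.py: a minimal sketch of the module the CLI command points at.
# Base-class names and signatures are assumptions; verify against your ZenML version.
from typing import Optional, Type

from zenml.alerter.base_alerter import (
    BaseAlerter,
    BaseAlerterConfig,
    BaseAlerterFlavor,
    BaseAlerterStepParameters,
)


class MyAlerter(BaseAlerter):
    """Toy alerter that just prints messages instead of calling a chat API."""

    def post(
        self, message: str, params: Optional[BaseAlerterStepParameters] = None
    ) -> bool:
        print(f"[my_flavor alerter] {message}")
        return True

    def ask(
        self, question: str, params: Optional[BaseAlerterStepParameters] = None
    ) -> bool:
        print(f"[my_flavor alerter] {question} -> auto-approved")
        return True


class MyAlerterFlavor(BaseAlerterFlavor):
    """Flavor class that ZenML resolves from the path given to the CLI."""

    @property
    def name(self) -> str:
        return "my_flavor"

    @property
    def config_class(self) -> Type[BaseAlerterConfig]:
        return BaseAlerterConfig

    @property
    def implementation_class(self) -> Type[MyAlerter]:
        return MyAlerter
```
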
2 changes: 1 addition & 1 deletion docs/book/component-guide/annotators/custom.md
@@ -5,7 +5,7 @@ description: Learning how to develop a custom annotator.
# Develop a Custom Annotator

{% hint style="info" %}
- Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
+ Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
{% endhint %}

Annotators are a stack component that enables the use of data annotation as part of your ZenML stack and pipelines. You can use the associated CLI command to launch annotation, configure your datasets and get stats on how many labeled tasks you have ready for use.
12 changes: 6 additions & 6 deletions docs/book/component-guide/artifact-stores/artifact-stores.md
@@ -10,17 +10,17 @@ The Artifact Store is a central component in any MLOps stack. As the name sugges
ZenML automatically serializes and saves the data circulated through your pipelines in the Artifact Store: datasets, models, data profiles, data and model validation reports, and generally any object that is returned by a pipeline step. This is coupled with tracking in ZenML to provide extremely useful features such as caching and provenance/lineage tracking and pipeline reproducibility.

{% hint style="info" %}
- Not all objects returned by pipeline steps are physically stored in the Artifact Store, nor do they have to be. How artifacts are serialized and deserialized and where their contents are stored are determined by the particular implementation of the [Materializer](../../how-to/handle-data-artifacts/handle-custom-data-types.md) associated with the artifact data type. The majority of Materializers shipped with ZenML use the Artifact Store which is part of the active Stack as the location where artifacts are kept.
+ Not all objects returned by pipeline steps are physically stored in the Artifact Store, nor do they have to be. How artifacts are serialized and deserialized and where their contents are stored are determined by the particular implementation of the [Materializer](../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md) associated with the artifact data type. The majority of Materializers shipped with ZenML use the Artifact Store which is part of the active Stack as the location where artifacts are kept.

- If you need to store _a particular type of pipeline artifact_ in a different medium (e.g. use an external model registry to store model artifacts, or an external data lake or data warehouse to store dataset artifacts), you can write your own [Materializer](../../how-to/handle-data-artifacts/handle-custom-data-types.md) to implement the custom logic required for it. In contrast, if you need to use an entirely different storage backend to store artifacts, one that isn't already covered by one of the ZenML integrations, you can [extend the Artifact Store abstraction](custom.md) to provide your own Artifact Store implementation.
+ If you need to store _a particular type of pipeline artifact_ in a different medium (e.g. use an external model registry to store model artifacts, or an external data lake or data warehouse to store dataset artifacts), you can write your own [Materializer](../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md) to implement the custom logic required for it. In contrast, if you need to use an entirely different storage backend to store artifacts, one that isn't already covered by one of the ZenML integrations, you can [extend the Artifact Store abstraction](custom.md) to provide your own Artifact Store implementation.
{% endhint %}
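
As a rough illustration of the custom Materializer mentioned in the hint above (not part of this diff): a Materializer typically subclasses `BaseMaterializer`, declares the Python types it handles, and reads and writes files under `self.uri` in the active Artifact Store. The class variables and `load`/`save` signatures below follow the pattern of recent ZenML releases and should be verified against your version.

```python
# Rough sketch of a custom Materializer for a toy dataclass. Treat the exact
# base-class names and class variables as assumptions to verify.
import json
import os
from dataclasses import dataclass
from typing import Any, Type

from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


@dataclass
class MyDataset:
    rows: list


class MyDatasetMaterializer(BaseMaterializer):
    ASSOCIATED_TYPES = (MyDataset,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA

    def load(self, data_type: Type[Any]) -> MyDataset:
        # Read the JSON payload previously written under this artifact's URI.
        with fileio.open(os.path.join(self.uri, "data.json"), "r") as f:
            return MyDataset(rows=json.load(f))

    def save(self, data: MyDataset) -> None:
        # Persist the object under this artifact's URI in the active store.
        with fileio.open(os.path.join(self.uri, "data.json"), "w") as f:
            json.dump(data.rows, f)
```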

In addition to pipeline artifacts, the Artifact Store may also be used as storage backed by other specialized stack components that need to store their data in the form of persistent object storage. The [Great Expectations Data Validator](../data-validators/great-expectations.md) is such an example.

Related concepts:

* the Artifact Store is a type of Stack Component that needs to be registered as part of your ZenML [Stack](../../user-guide/production-guide/understand-stacks.md).
- * the objects circulated through your pipelines are serialized and stored in the Artifact Store using [Materializers](../../how-to/handle-data-artifacts/handle-custom-data-types.md). Materializers implement the logic required to serialize and deserialize the artifact contents and to store them and retrieve their contents to/from the Artifact Store.
+ * the objects circulated through your pipelines are serialized and stored in the Artifact Store using [Materializer](../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md). Materializers implement the logic required to serialize and deserialize the artifact contents and to store them and retrieve their contents to/from the Artifact Store.

### When to use it

@@ -57,11 +57,11 @@ zenml artifact-store register s3_store -f s3 --path s3://my_bucket
The Artifact Store provides low-level object storage services for other ZenML mechanisms. When you develop ZenML pipelines, you normally don't even have to be aware of its existence or interact with it directly. ZenML provides higher-level APIs that can be used as an alternative to store and access artifacts:

* return one or more objects from your pipeline steps to have them automatically saved in the active Artifact Store as pipeline artifacts.
- * [retrieve pipeline artifacts](../../how-to/handle-data-artifacts/load-artifacts-into-memory.md) from the active Artifact Store after a pipeline run is complete.
+ * [retrieve pipeline artifacts](../../how-to/data-artifact-management/handle-data-artifacts/load-artifacts-into-memory.md) from the active Artifact Store after a pipeline run is complete.

You will probably need to interact with the [low-level Artifact Store API](artifact-stores.md#the-artifact-store-api) directly:

- * if you implement custom [Materializers](../../how-to/handle-data-artifacts/handle-custom-data-types.md) for your artifact data types
+ * if you implement custom [Materializers](../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md) for your artifact data types
* if you want to store custom objects in the Artifact Store

#### The Artifact Store API
@@ -91,7 +91,7 @@ with fileio.open(artifact_uri, "w") as f:
f.write(artifact_contents)
```

- When using the Artifact Store API to write custom Materializers, the base artifact URI path is already provided. See the documentation on [Materializers](../../how-to/handle-data-artifacts/handle-custom-data-types.md) for an example.
+ When using the Artifact Store API to write custom Materializers, the base artifact URI path is already provided. See the documentation on [Materializers](../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md) for an example.
{% endhint %}

The following are some code examples showing how to use the Artifact Store API for various operations:
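
Those code examples are collapsed in this view; as a stand-in, here is a small sketch of the kind of operation they cover, writing and reading a file under the root of the active Artifact Store with `zenml.io.fileio`. The attribute chain `Client().active_stack.artifact_store.path` is assumed from recent ZenML releases.

```python
# Stand-in sketch (not the collapsed examples themselves): write and read a
# small text file under the active Artifact Store's root via the fileio API.
import os

from zenml.client import Client
from zenml.io import fileio

root = Client().active_stack.artifact_store.path  # base URI of the active store
uri = os.path.join(root, "custom", "hello.txt")

fileio.makedirs(os.path.dirname(uri))  # create the parent prefix/directory
with fileio.open(uri, "w") as f:
    f.write("hello from the Artifact Store API")

with fileio.open(uri, "r") as f:
    print(f.read())
```
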
14 changes: 7 additions & 7 deletions docs/book/component-guide/artifact-stores/azure.md
@@ -24,9 +24,9 @@ You should use the Azure Artifact Store when you decide to keep your ZenML artif
{% hint style="info" %}
Would you like to skip ahead and deploy a full ZenML cloud stack already,
including an Azure Artifact Store? Check out the
- [in-browser stack deployment wizard](../../how-to/stack-deployment/deploy-a-cloud-stack.md),
- the [stack registration wizard](../../how-to/stack-deployment/register-a-cloud-stack.md),
- or [the ZenML Azure Terraform module](../../how-to/stack-deployment/deploy-a-cloud-stack-with-terraform.md)
+ [in-browser stack deployment wizard](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md),
+ the [stack registration wizard](../../how-to/infrastructure-deployment/stack-deployment/register-a-cloud-stack.md),
+ or [the ZenML Azure Terraform module](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md)
for a shortcut on how to deploy & register this stack component.
{% endhint %}

@@ -52,7 +52,7 @@ Depending on your use case, however, you may also need to provide additional con

#### Authentication Methods

- Integrating and using an Azure Artifact Store in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the Azure cloud platform is through [an Azure Service Connector](../../how-to/auth-management/azure-service-connector.md). This is particularly useful if you are configuring ZenML stacks that combine the Azure Artifact Store with other remote stack components also running in Azure.
+ Integrating and using an Azure Artifact Store in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the Azure cloud platform is through [an Azure Service Connector](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md). This is particularly useful if you are configuring ZenML stacks that combine the Azure Artifact Store with other remote stack components also running in Azure.

You will need the following information to configure Azure credentials for ZenML, depending on which type of Azure credentials you want to use:

Expand Down Expand Up @@ -81,12 +81,12 @@ The implicit authentication method also needs to be coordinated with other stack
* [Step Operators](../step-operators/step-operators.md) need to access the Artifact Store to manage step-level artifacts
* [Model Deployers](../model-deployers/model-deployers.md) need to access the Artifact Store to load served models

- To enable these use cases, it is recommended to use [an Azure Service Connector](../../how-to/auth-management/azure-service-connector.md) to link your Azure Artifact Store to the remote Azure Blob storage container.
+ To enable these use cases, it is recommended to use [an Azure Service Connector](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md) to link your Azure Artifact Store to the remote Azure Blob storage container.
{% endhint %}
{% endtab %}

{% tab title="Azure Service Connector (recommended)" %}
- To set up the Azure Artifact Store to authenticate to Azure and access an Azure Blob storage container, it is recommended to leverage the many features provided by [the Azure Service Connector](../../how-to/auth-management/azure-service-connector.md) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components.
+ To set up the Azure Artifact Store to authenticate to Azure and access an Azure Blob storage container, it is recommended to leverage the many features provided by [the Azure Service Connector](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components.

If you don't already have an Azure Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure an Azure Service Connector that can be used to access more than one Azure blob storage container or even more than one type of Azure resource:

@@ -112,7 +112,7 @@ Successfully registered service connector `azure-blob-demo` with access to the f
```
{% endcode %}

- > **Note**: Please remember to grant the Azure service principal permissions to read and write to your Azure Blob storage container as well as to list accessible storage accounts and Blob containers. For a full list of permissions required to use an AWS Service Connector to access one or more S3 buckets, please refer to the [Azure Service Connector Blob storage container resource type documentation](../../how-to/auth-management/azure-service-connector.md#azure-blob-storage-container) or read the documentation available in the interactive CLI commands and dashboard. The Azure Service Connector supports [many different authentication methods](../../how-to/auth-management/azure-service-connector.md#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use-case.
+ > **Note**: Please remember to grant the Azure service principal permissions to read and write to your Azure Blob storage container as well as to list accessible storage accounts and Blob containers. For a full list of permissions required to use an AWS Service Connector to access one or more S3 buckets, please refer to the [Azure Service Connector Blob storage container resource type documentation](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md#azure-blob-storage-container) or read the documentation available in the interactive CLI commands and dashboard. The Azure Service Connector supports [many different authentication methods](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use-case.

If you already have one or more Azure Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the Azure Blob storage container you want to use for your Azure Artifact Store by running e.g.:

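The CLI listing introduced above is collapsed in this view. As a complementary check from Python, the sketch below confirms that the active stack's Artifact Store points at the expected Azure container and that it can be listed with the configured credentials; attribute names such as `.flavor` and `.path` are assumptions based on recent ZenML releases.

```python
# Illustrative sanity check: confirm the active stack's artifact store is the
# Azure one and that its container can be listed with the configured credentials.
from zenml.client import Client
from zenml.io import fileio

store = Client().active_stack.artifact_store
print(store.name, store.flavor, store.path)  # e.g. azure_store azure az://<container>
print(fileio.listdir(store.path))            # fails here if authentication is misconfigured
```
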
2 changes: 1 addition & 1 deletion docs/book/component-guide/artifact-stores/custom.md
@@ -5,7 +5,7 @@ description: Learning how to develop a custom artifact store.
# Develop a custom artifact store

{% hint style="info" %}
- Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
+ Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md). This guide provides an essential understanding of ZenML's component flavor concepts.
{% endhint %}

ZenML comes equipped with [Artifact Store implementations](./artifact-stores.md#artifact-store-flavors) that you can use to store artifacts on a local filesystem or in the managed AWS, GCP, or Azure cloud object storage services. However, if you need to use a different type of object storage service as a backend for your ZenML Artifact Store, you can extend ZenML to provide your own custom Artifact Store implementation.