This guide explains how to use the load_reference_data_from_whylabs function in your project to load a reference profile from WhyLabs. The function fetches the profile using the WhyLabs API and provides it as a DatasetProfileView object for use in data validation and monitoring.
To use the function, you need to set the following environment variables:
- WHYLABS_DEFAULT_ORG_ID: Your WhyLabs organization ID.
- WHYLABS_DEFAULT_DATASET_ID: Your WhyLabs dataset (model) ID.
- WHYLABS_API_KEY: Your WhyLabs API key for authentication.
- WHYLABS_REF_ID: The unique reference profile ID.
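Before running a step, it can help to confirm that all four variables are present. The following is a minimal sketch of such a pre-flight check. The helper name find_missing_whylabs_vars is hypothetical and is not part of the project or of any WhyLabs library; only the variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "WHYLABS_DEFAULT_ORG_ID",
    "WHYLABS_DEFAULT_DATASET_ID",
    "WHYLABS_API_KEY",
    "WHYLABS_REF_ID",
)


def find_missing_whylabs_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration. `environ` defaults to os.environ;
    passing a plain dict makes the check easy to exercise in tests.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling find_missing_whylabs_vars() with no arguments inspects the current process environment. An empty list means every required variable is set.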
WHYLABS_API_KEY
- Go to the WhyLabs Hub.
- Navigate to Menu → Settings → Manage API tokens.
- Create or copy an API token for use in your project.
WHYLABS_REF_ID
- Open your WhyLabs Dashboard.
- Locate the reference profile you want to use.
- Refer to the screenshot below:
For more information about working with environment variables in Valohai, refer to the Valohai Environment Variables Documentation.
Ensure the following libraries are installed:
pip install whylabs-client whylogs
When running the inference step, specify the use_whylabs_reference_profile parameter:
- use_whylabs_reference_profile: Indicates whether to load the reference profile from WhyLabs. If set to False, the reference profile will be created dynamically using the input data provided in the ref_data directory.
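As an illustration of how an inference script might consume this parameter, here is a rough sketch. It assumes the value reaches the script as a command-line argument such as --use_whylabs_reference_profile=True; how your step actually receives it depends on your valohai.yaml. The helpers parse_bool and choose_reference_source are hypothetical, and the default of False is an assumption rather than something this guide specifies.

```python
import argparse


def parse_bool(value):
    """Interpret common textual spellings of a boolean.

    argparse's type=bool would treat any non-empty string (even "False")
    as True, so the text is converted explicitly.
    """
    text = str(value).strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Default of False is an assumption made for this sketch.
    parser.add_argument(
        "--use_whylabs_reference_profile", type=parse_bool, default=False
    )
    return parser.parse_args(argv)


def choose_reference_source(use_whylabs):
    """Return a label naming where the reference profile should come from."""
    return "whylabs" if use_whylabs else "ref_data"
```

In a real script, the "whylabs" branch would call load_reference_data_from_whylabs, and the "ref_data" branch would build the profile from the files in the ref_data directory.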
The load_reference_data_from_whylabs function performs the following steps:
- Environment Variable Loading:
  - The function retrieves ORG_ID, MODEL_ID, API_KEY, and REF_ID from the environment. If any are missing, it raises an error.
- API Client Configuration:
  - Configures the WhyLabs API client using whylabs_client.Configuration. The API key is used for authentication.
- Fetching Metadata:
  - Calls get_reference_profile from DatasetProfileApi to retrieve metadata for the reference profile.
- Downloading the Profile:
  - Uses the download_url from the API response to fetch the reference profile as binary data.
- Deserializing the Profile:
  - Converts the binary data into a DatasetProfileView object using whylogs.DatasetProfileView.deserialize.
- Returning the Profile:
  - The function returns the reference_profile, ready for use in your pipeline.
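The steps above can be sketched roughly as follows. This is not the project's actual implementation: the function name load_reference_profile_sketch, the optional environ argument, the host URL, the "ApiKeyAuth" key name, and the positional argument order of get_reference_profile are all assumptions that should be checked against the whylabs-client version you have installed. A missing variable surfaces here as a KeyError; the real function may raise a different error type.

```python
import os
import urllib.request

# Maps the short names used in the steps above to the environment variables.
ENV_VARS = {
    "org_id": "WHYLABS_DEFAULT_ORG_ID",
    "dataset_id": "WHYLABS_DEFAULT_DATASET_ID",
    "api_key": "WHYLABS_API_KEY",
    "ref_id": "WHYLABS_REF_ID",
}


def load_reference_profile_sketch(environ=None):
    """Illustrative sketch of the six steps; names here are hypothetical."""
    env = os.environ if environ is None else environ

    # Step 1: read the settings. A missing variable raises KeyError.
    settings = {key: env[var] for key, var in ENV_VARS.items()}

    # Third-party imports are deferred until the settings are known to exist.
    import whylabs_client
    from whylabs_client.api.dataset_profile_api import DatasetProfileApi
    from whylogs.core import DatasetProfileView

    # Step 2: configure the client (host and key name are assumptions).
    config = whylabs_client.Configuration(host="https://api.whylabsapp.com")
    config.api_key["ApiKeyAuth"] = settings["api_key"]

    with whylabs_client.ApiClient(config) as api_client:
        # Step 3: fetch the reference profile metadata.
        api = DatasetProfileApi(api_client)
        metadata = api.get_reference_profile(
            settings["org_id"], settings["dataset_id"], settings["ref_id"]
        )

    # Step 4: download the serialized profile as binary data.
    with urllib.request.urlopen(metadata.download_url) as response:
        payload = response.read()

    # Steps 5 and 6: deserialize and return the profile view.
    return DatasetProfileView.deserialize(payload)
```

The sketch uses urllib from the standard library for the download so that it needs no packages beyond the two installed above; an HTTP library such as requests would work equally well.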
- WhyLabs Documentation:
  - For more details on managing profiles in WhyLabs, see the WhyLabs Documentation.
- WhyLabs Python Client:
  - Learn more about the get_reference_profile API in the WhyLabs Python Client GitHub Documentation.
- Valohai Environment Variables:
  - Learn how to manage environment variables in Valohai from the official documentation.
from your_project_module import load_reference_data_from_whylabs
# Ensure the required environment variables are set
reference_profile = load_reference_data_from_whylabs()
print(f"Successfully loaded reference profile with timestamp: {reference_profile.dataset_timestamp}")
When running inference, use the use_whylabs_reference_profile parameter to specify whether the reference profile is loaded from WhyLabs. You can adjust this parameter in the valohai.yaml file or through the Valohai UI.