CLI (v2) sweep job YAML schema

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json.

YAML syntax

Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of your file enables you to invoke schema and resource completions.
`type`	const	Required. The type of job.	`sweep`	`sweep`
`name`	string	Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure ML will autogenerate a GUID for the name.
`display_name`	string	Display name of the job in the studio UI. Can be non-unique within the workspace. If omitted, Azure ML will autogenerate a human-readable adjective-noun identifier for the display name.
`experiment_name`	string	Experiment name to organize the job under. Each job's run record will be organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure ML will default it to the name of the working directory where the job was created.
`description`	string	Description of the job.
`tags`	object	Dictionary of tags for the job.
`sampling_algorithm`	object	Required. The hyperparameter sampling algorithm to use over the `search_space`. One of RandomSamplingAlgorithm, GridSamplingAlgorithm, or BayesianSamplingAlgorithm.
`search_space`	object	Required. Dictionary of the hyperparameter search space. The key is the name of the hyperparameter and the value is the parameter expression. Hyperparameters can be referenced in the `trial.command` using the `${{ search_space.<hyperparameter> }}` expression.
`search_space.<hyperparameter>`	object	See Parameter expressions for the set of possible expressions to use.
`objective.primary_metric`	string	Required. The name of the primary metric reported by each trial job. The metric must be logged in the user's training script using `mlflow.log_metric()` with the same corresponding metric name.
`objective.goal`	string	Required. The optimization goal of the `objective.primary_metric`.	`maximize`, `minimize`
`early_termination`	object	The early termination policy to use. A trial job is canceled when the criteria of the specified policy are met. If omitted, no early termination policy will be applied. One of BanditPolicy, MedianStoppingPolicy, or TruncationSelectionPolicy.
`limits`	object	Limits for the sweep job. See Attributes of the `limits` key.
`compute`	string	Required. Name of the compute target to execute the job on, using the `azureml:<compute_name>` syntax.
`trial`	object	Required. The job template for each trial. Each trial job will be provided with a different combination of hyperparameter values that the system samples from the `search_space`. See Attributes of the `trial` key.
`inputs`	object	Dictionary of inputs to the job. The key is a name for the input within the context of the job and the value is the input value. Inputs can be referenced in the `command` using the `${{ inputs.<input_name> }}` expression.
`inputs.<input_name>`	number, integer, boolean, string or object	One of a literal value (of type number, integer, boolean, or string) or an object containing a job input data specification.
`outputs`	object	Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. Outputs can be referenced in the `command` using the `${{ outputs.<output_name> }}` expression.
`outputs.<output_name>`	object	You can leave the object empty, in which case the output defaults to type `uri_folder` and Azure ML system-generates a location for it. Files are written to the output directory via read-write mount. To specify a different mode for the output, provide an object containing the job output specification.
`identity`	object	The identity used for data access. One of UserIdentityConfiguration, ManagedIdentityConfiguration, or None. With UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder. Otherwise, the managed identity of the compute target is used.

Sampling algorithms

RandomSamplingAlgorithm

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of sampling algorithm.	`random`
`seed`	integer	A random seed to use for initializing the random number generation. If omitted, the default seed value will be null.
`rule`	string	The type of random sampling to use. The default, `random`, will use simple uniform random sampling, while `sobol` will use the Sobol quasirandom sequence.	`random`, `sobol`	`random`
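
As a sketch of how the keys in this table fit together, the fragment below writes `sampling_algorithm` as an object instead of the bare `random` string used in the Examples section. The seed value is an arbitrary illustration, not a recommended setting.

```yaml
# Fragment of a sweep job file; only the sampling settings are shown.
sampling_algorithm:
  type: random
  rule: sobol   # Sobol quasirandom sequence instead of simple uniform sampling
  seed: 42      # arbitrary example value, set only to make sampling repeatable
```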

GridSamplingAlgorithm

Key	Type	Description	Allowed values
`type`	const	Required. The type of sampling algorithm.	`grid`

BayesianSamplingAlgorithm

Key	Type	Description	Allowed values
`type`	const	Required. The type of sampling algorithm.	`bayesian`

Early termination policies

BanditPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`bandit`
`slack_factor`	number	The ratio used to calculate the allowed distance from the best performing trial. One of `slack_factor` or `slack_amount` is required.
`slack_amount`	number	The absolute distance allowed from the best performing trial. One of `slack_factor` or `slack_amount` is required.
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy applies on every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`
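
The fragment below is a minimal sketch of a bandit policy built from the keys above. The numeric values are illustrative assumptions. Following the `delay_evaluation` rule in the table, an `evaluation_interval` of 2 with a `delay_evaluation` of 5 would apply the policy at intervals 6, 8, 10, and so on.

```yaml
# Fragment of a sweep job file; only the early termination policy is shown.
early_termination:
  type: bandit
  slack_factor: 0.1        # set either slack_factor or slack_amount
  evaluation_interval: 2   # policy is considered at every second interval
  delay_evaluation: 5      # first evaluation at the first multiple of 2 that is >= 5
```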

MedianStoppingPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`median_stopping`
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy applies on every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`
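
A median stopping policy takes only the timing keys. In this sketch the values are illustrative, and with the default interval of 1 the policy would first apply at interval 5 and at every interval after it.

```yaml
# Fragment of a sweep job file; only the early termination policy is shown.
early_termination:
  type: median_stopping
  evaluation_interval: 1   # the default, written out for clarity
  delay_evaluation: 5      # skip the first four intervals
```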

TruncationSelectionPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`truncation_selection`
`truncation_percentage`	integer	Required. The percentage of trial jobs to cancel at each evaluation interval.
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy applies on every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`
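
As a hedged illustration of the table above, this fragment asks for 20 percent of trial jobs to be canceled at each evaluation. The percentage and timing values are assumptions chosen for the example.

```yaml
# Fragment of a sweep job file; only the early termination policy is shown.
early_termination:
  type: truncation_selection
  truncation_percentage: 20   # required; percentage of trial jobs to cancel
  evaluation_interval: 2
  delay_evaluation: 4         # policy applies at intervals 4, 6, 8, ...
```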

Parameter expressions

choice

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`choice`
`values`	array	Required. The list of discrete values to choose from.

randint

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`randint`
`upper`	integer	Required. The exclusive upper bound for the range of integers.

qlognormal, qnormal

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`qlognormal`, `qnormal`
`mu`	number	Required. The mean of the normal distribution.
`sigma`	number	Required. The standard deviation of the normal distribution.
`q`	integer	Required. The smoothing factor.

qloguniform, quniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`qloguniform`, `quniform`
`min_value`	number	Required. The minimum value in the range (inclusive).
`max_value`	number	Required. The maximum value in the range (inclusive).
`q`	integer	Required. The smoothing factor.

lognormal, normal

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`lognormal`, `normal`
`mu`	number	Required. The mean of the normal distribution.
`sigma`	number	Required. The standard deviation of the normal distribution.

loguniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`loguniform`
`min_value`	number	Required. The minimum value in the range will be `exp(min_value)` (inclusive).
`max_value`	number	Required. The maximum value in the range will be `exp(max_value)` (inclusive).
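
Because `loguniform` bounds are exponents, it can help to see the arithmetic written out. The sketch below uses a hypothetical hyperparameter named `learning_rate`; the bounds are chosen so that the sampled range runs from about 0.001 to about 0.1.

```yaml
# Fragment of a sweep job file; only the search space is shown.
search_space:
  learning_rate:       # hypothetical hyperparameter name
    type: loguniform
    min_value: -6.9    # lower end of the range is exp(-6.9), roughly 0.001
    max_value: -2.3    # upper end of the range is exp(-2.3), roughly 0.1
```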

uniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`uniform`
`min_value`	number	Required. The minimum value in the range (inclusive).
`max_value`	number	Required. The maximum value in the range (inclusive).
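
The parameter expressions described above can be mixed within one `search_space`. The following sketch combines four of them; every hyperparameter name and value is a hypothetical placeholder.

```yaml
# Fragment of a sweep job file; only the search space is shown.
search_space:
  batch_size:
    type: choice
    values: [16, 32, 64]
  num_layers:
    type: randint
    upper: 5           # exclusive upper bound, so 5 itself is never sampled
  hidden_units:
    type: quniform
    min_value: 32
    max_value: 256
    q: 32              # smoothing factor
  momentum:
    type: normal
    mu: 0.9            # mean
    sigma: 0.05        # standard deviation
```

Each entry can then be passed to the training script through `${{ search_space.<hyperparameter> }}` in `trial.command`.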

Attributes of the `limits` key

Key	Type	Description	Default value
`max_total_trials`	integer	The maximum number of trial jobs.	`1000`
`max_concurrent_trials`	integer	The maximum number of trial jobs that can run concurrently.	Defaults to `max_total_trials`.
`timeout`	integer	The maximum time in seconds the entire sweep job is allowed to run. Once this limit is reached, the system will cancel the sweep job, including all its trials.	`5184000`
`trial_timeout`	integer	The maximum time in seconds each trial job is allowed to run. Once this limit is reached, the system will cancel the trial.
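
A sketch of a `limits` block that sets all four keys follows. The numbers are illustrative; both timeouts are expressed in seconds, as the table specifies.

```yaml
# Fragment of a sweep job file; only the limits are shown.
limits:
  max_total_trials: 40
  max_concurrent_trials: 4
  timeout: 14400        # 4 hours for the entire sweep job
  trial_timeout: 1800   # 30 minutes for each trial job
```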

Attributes of the `trial` key

Key	Type	Description	Default value
`command`	string	Required. The command to execute.
`code`	string	Local path to the source code directory to be uploaded and used for the job.
`environment`	string or object	Required. The environment to use for the job. This can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. To define an environment inline, follow the Environment schema. Exclude the `name` and `version` properties as they aren't supported for inline environments.
`environment_variables`	object	Dictionary of environment variable name-value pairs to set on the process where the command is executed.
`distribution`	object	The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration.
`resources.instance_count`	integer	The number of nodes to use for the job.	`1`
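
The fragment below sketches a `trial` block that uses the optional keys from this table. The script name, environment name, and environment variable are hypothetical and would need to match assets in your own workspace.

```yaml
# Fragment of a sweep job file; only the trial template is shown.
trial:
  code: src
  command: >-
    python train.py
    --learning-rate ${{search_space.learning_rate}}
  environment: azureml:my-training-env:1   # hypothetical registered environment
  environment_variables:
    LOG_LEVEL: "info"                      # hypothetical variable read by the script
  resources:
    instance_count: 2                      # nodes used by each trial job
```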

Distribution configurations

MpiConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Distribution type.	`mpi`
`process_count_per_instance`	integer	Required. The number of processes per node to launch for the job.

PyTorchConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`pytorch`
`process_count_per_instance`	integer	The number of processes per node to launch for the job.		`1`

TensorFlowConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`tensorflow`
`worker_count`	integer	The number of workers to launch for the job.		Defaults to `resources.instance_count`.
`parameter_server_count`	integer	The number of parameter servers to launch for the job.		`0`
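
As a minimal sketch, a distributed PyTorch trial could pair `distribution` with `resources.instance_count` as shown below. The counts are assumptions; with these values each trial would launch 4 processes on each of 2 nodes.

```yaml
# Fragment of a sweep job file; command, code, and environment are omitted.
trial:
  distribution:
    type: pytorch
    process_count_per_instance: 4   # processes launched on each node
  resources:
    instance_count: 2               # nodes used by each trial job
```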

Job inputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source. Learn more about data access.	`uri_file`, `uri_folder`, `mltable`, `mlflow_model`	`uri_folder`
`path`	string	The path to the data to use as input. This can be specified in a few ways: - A local path to the data source file or folder, for example, `path: ./iris.csv`. The data will get uploaded during job submission. - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. For more information on using the `azureml://` URI format, see Core yaml syntax. - An existing registered Azure ML data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), for example, `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`.
`mode`	string	Mode of how the data should be delivered to the compute target. For read-only mount (`ro_mount`), the data will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as a file. Azure ML will resolve the input to the mount path. For `download` mode the data will be downloaded to the compute target. Azure ML will resolve the input to the downloaded path. If you only want the URL of the storage location of the data artifact(s) rather than mounting or downloading the data itself, you can use the `direct` mode. This will pass in the URL of the storage location as the job input. In this case you're fully responsible for handling credentials to access the storage.	`ro_mount`, `download`, `direct`	`ro_mount`
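
The input forms described above can appear side by side. In this sketch the input names are hypothetical, and the paths reuse the examples from the `path` description.

```yaml
# Fragment of a sweep job file; only the inputs are shown.
inputs:
  training_data:                        # hypothetical input name
    type: uri_folder
    path: azureml:cifar10-data@latest   # latest version of a registered data asset
    mode: download                      # copy to the compute target instead of mounting
  iris_csv:
    type: uri_file
    path: ./iris.csv                    # local file, uploaded during job submission
  epochs: 10                            # literal value
```

Each of these can be referenced in the command with `${{ inputs.<input_name> }}`.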

Job outputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job output. For the default `uri_folder` type, the output will correspond to a folder.	`uri_file`, `uri_folder`, `mltable`, `mlflow_model`	`uri_folder`
`mode`	string	Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode (`rw_mount`) the output directory will be a mounted directory. For upload mode the file(s) written will get uploaded at the end of the job.	`rw_mount`, `upload`	`rw_mount`
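
A sketch of an `outputs` block follows, with hypothetical output names. The first entry switches to upload mode. The second is left empty so the defaults (`uri_folder`, `rw_mount`) apply; writing the empty object as `{}` is an assumption about how to express it in YAML.

```yaml
# Fragment of a sweep job file; only the outputs are shown.
outputs:
  model_dir:           # hypothetical output name
    type: uri_folder
    mode: upload       # files are uploaded at the end of the job
  logs: {}             # empty object, so the default type and mode are used
```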

Identity configurations

UserIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`user_identity`

ManagedIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`managed` or `managed_identity`
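
For illustration, selecting the job submitter's identity for data access would look like the fragment below. Use `managed` or `managed_identity` as the type to select the compute target's managed identity instead.

```yaml
# Fragment of a sweep job file; only the identity setting is shown.
identity:
  type: user_identity
```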

Remarks

Use the `az ml job` command to manage Azure Machine Learning jobs.

Examples

Examples are available in the examples GitHub repository. Several are shown below.

YAML: hello sweep

$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
trial:
  command: >-
    python hello-sweep.py
    --A ${{inputs.A}}
    --B ${{search_space.B}}
    --C ${{search_space.C}}
  code: src
  environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
inputs:
  A: 0.5
compute: azureml:cpu-cluster
sampling_algorithm: random
search_space:
  B:
    type: choice
    values: ["hello", "world", "hello_world"]
  C:
    type: uniform
    min_value: 0.1
    max_value: 1.0
objective:
  goal: minimize
  primary_metric: random_metric
limits:
  max_total_trials: 4
  max_concurrent_trials: 2
  timeout: 3600
display_name: hello-sweep-example
experiment_name: hello-sweep-example
description: Hello sweep job example.

YAML: basic Python model hyperparameter tuning

$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
trial:
  code: src
  command: >-
    python main.py 
    --iris-csv ${{inputs.iris_csv}}
    --C ${{search_space.C}}
    --kernel ${{search_space.kernel}}
    --coef0 ${{search_space.coef0}}
  environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest
inputs:
  iris_csv: 
    type: uri_file
    path: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
compute: azureml:cpu-cluster
sampling_algorithm: random
search_space:
  C:
    type: uniform
    min_value: 0.5
    max_value: 0.9
  kernel:
    type: choice
    values: ["rbf", "linear", "poly"]
  coef0:
    type: uniform
    min_value: 0.1
    max_value: 1
objective:
  goal: minimize
  primary_metric: training_f1_score
limits:
  max_total_trials: 20
  max_concurrent_trials: 10
  timeout: 7200
display_name: sklearn-iris-sweep-example
experiment_name: sklearn-iris-sweep-example
description: Sweep hyperparameters for training a scikit-learn SVM on the Iris dataset.

Next steps

Install and use the CLI (v2)
