
Pmanoj/read model specific defaults (#2442)
* reading the model specific defaults from model card

* updating the metric defaults for the tasks

* updating the defaults from bool -> string

* fixing formatting issues
jpmann authored Jul 11, 2023
1 parent 738af5c commit c9eefae
Showing 5 changed files with 255 additions and 45 deletions.
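The change shared across these notebooks: per-model optimization defaults are now read from the model card's `model_specific_defaults` tag (a stringified dict whose values are strings such as "true" rather than booleans), with a hardcoded fallback when the tag is absent. Below is a minimal, self-contained sketch of that pattern; the tag value shown is illustrative, not copied from any particular model card.

```python
import ast

# Illustrative stand-in for `foundation_model.tags` on a model fetched from the
# "azureml" registry; the tag value is a stringified Python dict whose values
# are the strings "true"/"false" rather than booleans.
tags = {
    "model_specific_defaults": "{'apply_lora': 'true', 'apply_deepspeed': 'true', 'apply_ort': 'true'}"
}

if "model_specific_defaults" in tags:
    # ast.literal_eval safely converts the stringified dict back into a Python dict
    optimization_parameters = ast.literal_eval(tags["model_specific_defaults"])
else:
    # fallback when the model card carries no per-model defaults
    optimization_parameters = dict(
        apply_lora="true", apply_deepspeed="true", apply_ort="true"
    )

print(optimization_parameters)
# -> {'apply_lora': 'true', 'apply_deepspeed': 'true', 'apply_ort': 'true'}
```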
@@ -92,7 +92,7 @@
" workspace_name=\"<WORKSPACE_NAME>\",\n",
" )\n",
"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml-preview\"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml\"\n",
"registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
"\n",
"experiment_name = \"question-answering-extractive-qna\"\n",
@@ -344,6 +344,54 @@
"Create the job that uses the `question-answering` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/training/finetune_acft_hf_nlp/components/pipeline_components/question_answering/README.md) about all the parameters supported for fine tuning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define finetune parameters\n",
"\n",
"Finetune parameters can be grouped into 2 categories - training parameters, optimization parameters\n",
"\n",
"Training parameters define the training aspects such as - \n",
"1. the optimizer, scheduler to use\n",
"2. the metric to optimize the finetune\n",
"3. number of training steps and the batch size\n",
"and so on\n",
"\n",
"Optimization parameters help in optimizing the GPU memory and effectively using the compute resources. Below are few of the parameters that belong to this category. _The optimization parameters differs for each model and are packaged with the model to handle these variations._\n",
"1. enable the deepspeed, ORT and LoRA\n",
"2. enable mixed precision training\n",
"2. enable multi-node training "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Training parameters\n",
"training_parameters = dict(\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"exact\",\n",
")\n",
"print(f\"The following training parameters are enabled - {training_parameters}\")\n",
"\n",
"# Optimization parameters - As these parameters are packaged with the model itself, lets retrieve those parameters\n",
"if \"model_specific_defaults\" in foundation_model.tags:\n",
" optimization_parameters = ast.literal_eval(\n",
" foundation_model.tags[\"model_specific_defaults\"]\n",
" ) # convert string to python dict\n",
"else:\n",
" optimization_parameters = dict(\n",
" apply_lora=\"true\", apply_deepspeed=\"true\", apply_ort=\"true\"\n",
" )\n",
"print(f\"The following optimizations are enabled - {optimization_parameters}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -400,14 +448,8 @@
" answer_text_key=\"text\",\n",
" # training settings\n",
" number_of_gpu_to_use_finetuning=gpus_per_node, # set to the number of GPUs available in the compute\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"exact\",\n",
" apply_lora=\"true\",\n",
" apply_deepspeed=\"true\",\n",
" apply_ort=\"true\",\n",
" **training_parameters,\n",
" **optimization_parameters\n",
" )\n",
" return {\n",
" # map the output of the fine tuning job to the output of the pipeline job so that we can easily register the fine tuned model\n",
@@ -91,7 +91,7 @@
" workspace_name=\"<WORKSPACE_NAME>\",\n",
" )\n",
"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml-preview\"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml\"\n",
"registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
"\n",
"experiment_name = \"summarization-news-summary\"\n",
@@ -349,6 +349,54 @@
"Create the job that uses the `summarization` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/training/finetune_acft_hf_nlp/components/pipeline_components/summarization/README.md) about all the parameters supported for fine tuning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define finetune parameters\n",
"\n",
"Finetune parameters can be grouped into 2 categories - training parameters, optimization parameters\n",
"\n",
"Training parameters define the training aspects such as - \n",
"1. the optimizer, scheduler to use\n",
"2. the metric to optimize the finetune\n",
"3. number of training steps and the batch size\n",
"and so on\n",
"\n",
"Optimization parameters help in optimizing the GPU memory and effectively using the compute resources. Below are few of the parameters that belong to this category. _The optimization parameters differs for each model and are packaged with the model to handle these variations._\n",
"1. enable the deepspeed, ORT and LoRA\n",
"2. enable mixed precision training\n",
"2. enable multi-node training "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Training parameters\n",
"training_parameters = dict(\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"rouge1\",\n",
")\n",
"print(f\"The following training parameters are enabled - {training_parameters}\")\n",
"\n",
"# Optimization parameters - As these parameters are packaged with the model itself, lets retrieve those parameters\n",
"if \"model_specific_defaults\" in foundation_model.tags:\n",
" optimization_parameters = ast.literal_eval(\n",
" foundation_model.tags[\"model_specific_defaults\"]\n",
" ) # convert string to python dict\n",
"else:\n",
" optimization_parameters = dict(\n",
" apply_lora=\"true\", apply_deepspeed=\"true\", apply_ort=\"true\"\n",
" )\n",
"print(f\"The following optimizations are enabled - {optimization_parameters}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -394,14 +442,8 @@
" summary_key=\"highlights\",\n",
" # training settings\n",
" number_of_gpu_to_use_finetuning=gpus_per_node, # set to the number of GPUs available in the compute\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"rouge1\",\n",
" apply_deepspeed=\"true\",\n",
" apply_ort=\"true\",\n",
" apply_lora=\"true\",\n",
" **training_parameters,\n",
" **optimization_parameters\n",
" )\n",
" return {\n",
" # map the output of the fine tuning job to the output of the pipeline job so that we can easily register the fine tuned model\n",
@@ -92,7 +92,7 @@
" workspace_name=\"<WORKSPACE_NAME>\",\n",
" )\n",
"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml-preview\"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml\"\n",
"registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
"\n",
"experiment_name = \"text-classification-emotion-detection\"\n",
@@ -386,6 +386,54 @@
"Create the job that uses the `text-classification` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/training/finetune_acft_hf_nlp/components/pipeline_components/text_classification/README.md) about all the parameters supported for fine tuning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define finetune parameters\n",
"\n",
"Finetune parameters can be grouped into 2 categories - training parameters, optimization parameters\n",
"\n",
"Training parameters define the training aspects such as - \n",
"1. the optimizer, scheduler to use\n",
"2. the metric to optimize the finetune\n",
"3. number of training steps and the batch size\n",
"and so on\n",
"\n",
"Optimization parameters help in optimizing the GPU memory and effectively using the compute resources. Below are few of the parameters that belong to this category. _The optimization parameters differs for each model and are packaged with the model to handle these variations._\n",
"1. enable the deepspeed, ORT and LoRA\n",
"2. enable mixed precision training\n",
"2. enable multi-node training "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Training parameters\n",
"training_parameters = dict(\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"f1_macro\",\n",
")\n",
"print(f\"The following training parameters are enabled - {training_parameters}\")\n",
"\n",
"# Optimization parameters - As these parameters are packaged with the model itself, lets retrieve those parameters\n",
"if \"model_specific_defaults\" in foundation_model.tags:\n",
" optimization_parameters = ast.literal_eval(\n",
" foundation_model.tags[\"model_specific_defaults\"]\n",
" ) # convert string to python dict\n",
"else:\n",
" optimization_parameters = dict(\n",
" apply_lora=\"true\", apply_deepspeed=\"true\", apply_ort=\"true\"\n",
" )\n",
"print(f\"The following optimizations are enabled - {optimization_parameters}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -431,14 +479,8 @@
" label_key=\"label_string\",\n",
" # Training settings\n",
" number_of_gpu_to_use_finetuning=gpus_per_node, # set to the number of GPUs available in the compute\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"f1_macro\",\n",
" apply_deepspeed=\"true\",\n",
" apply_lora=\"true\",\n",
" apply_ort=\"true\",\n",
" **training_parameters,\n",
" **optimization_parameters\n",
" )\n",
" return {\n",
" # map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model\n",
@@ -92,7 +92,7 @@
" workspace_name=\"<WORKSPACE_NAME>\",\n",
" )\n",
"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml-preview\"\n",
"# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml\"\n",
"registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
"\n",
"experiment_name = \"token-classification-ner\"\n",
@@ -352,6 +352,54 @@
"Create the job that uses the `token-classification` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/training/finetune_acft_hf_nlp/components/pipeline_components/token_classification/README.md) about all the parameters supported for fine tuning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define finetune parameters\n",
"\n",
"Finetune parameters can be grouped into 2 categories - training parameters, optimization parameters\n",
"\n",
"Training parameters define the training aspects such as - \n",
"1. the optimizer, scheduler to use\n",
"2. the metric to optimize the finetune\n",
"3. number of training steps and the batch size\n",
"and so on\n",
"\n",
"Optimization parameters help in optimizing the GPU memory and effectively using the compute resources. Below are few of the parameters that belong to this category. _The optimization parameters differs for each model and are packaged with the model to handle these variations._\n",
"1. enable the deepspeed, ORT and LoRA\n",
"2. enable mixed precision training\n",
"2. enable multi-node training "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Training parameters\n",
"training_parameters = dict(\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"f1\",\n",
")\n",
"print(f\"The following training parameters are enabled - {training_parameters}\")\n",
"\n",
"# Optimization parameters - As these parameters are packaged with the model itself, lets retrieve those parameters\n",
"if \"model_specific_defaults\" in foundation_model.tags:\n",
" optimization_parameters = ast.literal_eval(\n",
" foundation_model.tags[\"model_specific_defaults\"]\n",
" ) # convert string to python dict\n",
"else:\n",
" optimization_parameters = dict(\n",
" apply_lora=\"true\", apply_deepspeed=\"true\", apply_ort=\"true\"\n",
" )\n",
"print(f\"The following optimizations are enabled - {optimization_parameters}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -397,14 +445,8 @@
" tag_key=\"ner_tags_str\",\n",
" # Training settings\n",
" number_of_gpu_to_use_finetuning=gpus_per_node, # set to the number of GPUs available in the compute\n",
" num_train_epochs=3,\n",
" per_device_train_batch_size=1,\n",
" per_device_eval_batch_size=1,\n",
" learning_rate=2e-5,\n",
" metric_for_best_model=\"f1\",\n",
" apply_lora=\"true\",\n",
" apply_ort=\"true\",\n",
" apply_deepspeed=\"true\",\n",
" **training_parameters,\n",
" **optimization_parameters\n",
" )\n",
" return {\n",
" # map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model\n",
