Skip to content
This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Azureml train pipeline #186

Closed
wants to merge 14 commits into from

Conversation

annazietlow
Copy link
Collaborator

@annazietlow annazietlow commented Feb 7, 2020

[Still requires documentation about running with dutchf3 data specifically.]

This PR adds configurable Azure ML training pipelines to Seismic DeepLearning. Each pipeline is configured via the pipeline json and environment variables. See the README added in this PR (azureml_pipelines folder) for more information.

  • A few arguments had to be added to train.py: input and output. These allow the pipeline to pass in mounted Blob storage locations to be used as a regular file system.

  • azureml_requirements.txt file was added to the training directory. This file holds all dependencies for train.py so they can be installed on the compute in Azure ML.

  • An empty init.py was added to the segmentation/dutchf3 module. I was unable to install cv_lib and access cv_lib.segmentation.dutchf3.utils without this when installing from github (note: to get this version to work, I am referencing this feature branch in the requirements. It should be changed to microsoft:staging before merging)

  • There are two scripts added: kickoff_train_pipeline.py & cancel_run.py. Kickoff_train shows how to run an Azure ML train pipeline & cancel_run is a development script for cancelling the run instead of letting it take resources to completion.

  • The train_pipeline inherits from base_pipeline which would be a helpful abstraction for future addition of an inference pipeline

  • Integration tests for kicking off a train pipeline are added. These are not currently run in a build pipeline because they require communication with an Azure ML Service instance

  • Requirements for kicking off the pipeline are added to interpretation's requirements.txt

To Test:

  1. Prepare dutchf3 data and upload to blob storage
  2. Create an instance of Azure ML service
  3. Follow instructions for setting up the environment variables and kicking off a pipeline in the pipelines README
  4. Verify that training runs successfully (currently taking ~2 hours for 1 epoch)

experiments/interpretation/dutchf3_patch/local/train.py Outdated Show resolved Hide resolved
"input_datareference_name": "normalized_data_conditioned",
"input_dataset_name": "normalizeddataconditioned",
"source_directory": "train/",
"arguments": ["--splits", "splits",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the example mention how the input\output params could be be passed in?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It adds more detail to the params around line 111. Do you mean more than that?

@@ -0,0 +1,13 @@
git+https://github.com/olgaliak/seismic-deeplearning.git@azureml-train-pipeline#egg=cv_lib&subdirectory=cv_lib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please pull from /microsoft. Also is there a way to specify master branch here? We're on staging by default which might break one of these days

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately doing that would require the init.py file that I added so this wouldn't work until that file is added to /microsoft and /master (there is a way to specify the master branch). I have a note about changing it to microsoft/staging before the merge in the PR comment. Just leaving for now so it can be tested

@@ -0,0 +1,13 @@
git+https://github.com/olgaliak/seismic-deeplearning.git@azureml-train-pipeline#egg=cv_lib&subdirectory=cv_lib
git+https://github.com/microsoft/seismic-deeplearning.git#egg=deepseismic-interpretation&subdirectory=interpretation
opencv-python==4.1.2.30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use Pillow for your work instead of openCV?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are only using OpenCV for the boarder_contant feature to pad images. This is in the train.py that we are leveraging and in other areas of the code. Are you recommending to substitute this for similar functionality in Pillow?

@@ -0,0 +1,175 @@
# Integrating with AzureML
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also add some verbiage in the main README and point to this file? AML team would love this!

AML_COMPUTE_CLUSTER_MIN_NODES
AML_COMPUTE_CLUSTER_MAX_NODES
AML_COMPUTE_CLUSTER_SKU
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use https://pypi.org/project/python-dotenv/ in the code and then ask the user to create a .env file where these will reside. We don't want to set these in the notebook

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does use python-dotenv to grab the variables. This is just the readme instructing them to set those variables in any way they choose (.env file for vscode is mentioned)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@annazietlow The DeepSeismic team has requested the PR to go to a contrib branch so I've closed this one and we are continuing the conversations on that PR #195

AML_COMPUTE_CLUSTER_SKU
```
On Windows you can use:
`set VARIABLE=value`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support Windows :-p please feel free to get rid of this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants