Conversation
"input_datareference_name": "normalized_data_conditioned", | ||
"input_dataset_name": "normalizeddataconditioned", | ||
"source_directory": "train/", | ||
"arguments": ["--splits", "splits", |
Should the example mention how the input/output params can be passed in?
It adds more detail to the params around line 111. Do you mean more than that?
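For illustration, here is a minimal sketch of how an input DataReference like the one above could be wired into a pipeline step with the AzureML SDK (v1). The datastore name, compute target name, and the `--input` flag are assumptions for the example, not values taken from this PR:

```python
from azureml.core import Workspace, Datastore
from azureml.data.data_reference import DataReference
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
datastore = Datastore.get(ws, "workspaceblobstore")  # assumed datastore name

# Mounts the blob path into the run so the step sees it as a filesystem path
input_ref = DataReference(
    datastore=datastore,
    data_reference_name="normalized_data_conditioned",
    path_on_datastore="normalizeddataconditioned",
)

train_step = PythonScriptStep(
    script_name="train.py",
    source_directory="train/",
    arguments=["--splits", "splits", "--input", input_ref],
    inputs=[input_ref],
    compute_target="gpu-cluster",  # assumed compute target name
)
pipeline = Pipeline(workspace=ws, steps=[train_step])
```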
@@ -0,0 +1,13 @@
git+https://github.com/olgaliak/seismic-deeplearning.git@azureml-train-pipeline#egg=cv_lib&subdirectory=cv_lib |
Please pull from /microsoft. Also, is there a way to specify the master branch here? We're on staging by default, which might break one of these days.
Unfortunately, doing that would require the `__init__.py` file that I added, so this wouldn't work until that file is added to /microsoft and /master (there is a way to specify the master branch). I have a note in the PR comment about changing it to microsoft/staging before the merge. Just leaving it for now so it can be tested.
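For reference, pip's VCS URL syntax can pin a branch with `@<branch>`; a hypothetical requirements line targeting master on /microsoft would look like:

```
git+https://github.com/microsoft/seismic-deeplearning.git@master#egg=cv_lib&subdirectory=cv_lib
```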
@@ -0,0 +1,13 @@
git+https://github.com/olgaliak/seismic-deeplearning.git@azureml-train-pipeline#egg=cv_lib&subdirectory=cv_lib
git+https://github.com/microsoft/seismic-deeplearning.git#egg=deepseismic-interpretation&subdirectory=interpretation
opencv-python==4.1.2.30
Is it possible to use Pillow for your work instead of OpenCV?
I think we are only using OpenCV for the BORDER_CONSTANT feature to pad images. This is in the train.py that we are leveraging, and in other areas of the code. Are you recommending we substitute this with similar functionality in Pillow?
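For comparison, a minimal sketch of the equivalent constant-border padding in both libraries (illustrative sizes and values, not taken from train.py):

```python
import numpy as np
import cv2
from PIL import Image, ImageOps

img = np.zeros((100, 100, 3), dtype=np.uint8)

# OpenCV: pad 10 px on every side with a constant (zero) border
padded_cv = cv2.copyMakeBorder(img, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=0)

# Pillow: the same constant padding via ImageOps.expand
padded_pil = ImageOps.expand(Image.fromarray(img), border=10, fill=0)

assert padded_cv.shape[:2] == padded_pil.size[::-1]  # both are 120x120
```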
@@ -0,0 +1,175 @@
# Integrating with AzureML
Could you please also add some verbiage in the main README and point to this file? The AML team would love this!
AML_COMPUTE_CLUSTER_MIN_NODES
AML_COMPUTE_CLUSTER_MAX_NODES
AML_COMPUTE_CLUSTER_SKU
```
Please use https://pypi.org/project/python-dotenv/ in the code, and then ask the user to create a .env file where these will reside. We don't want to set these in the notebook.
This does use python-dotenv to grab the variables. This is just the README instructing them to set those variables in any way they choose (a .env file for VS Code is mentioned).
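For illustration, the python-dotenv pattern being described boils down to something like this (the default values are assumptions for the example):

```python
import os
from dotenv import load_dotenv

# Reads key=value pairs from a .env file into the process environment
load_dotenv()

min_nodes = int(os.getenv("AML_COMPUTE_CLUSTER_MIN_NODES", "0"))   # assumed default
max_nodes = int(os.getenv("AML_COMPUTE_CLUSTER_MAX_NODES", "2"))   # assumed default
vm_sku = os.getenv("AML_COMPUTE_CLUSTER_SKU", "STANDARD_NC6")      # assumed default
```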
@annazietlow The DeepSeismic team has requested that the PR go to a contrib branch, so I've closed this one and we are continuing the conversation in PR #195.
AML_COMPUTE_CLUSTER_SKU
```
On Windows you can use:
`set VARIABLE=value`
We don't support Windows :-p please feel free to get rid of this.
[Still requires documentation about running with dutchf3 data specifically.]
This PR adds configurable Azure ML training pipelines to Seismic DeepLearning. Each pipeline is configured via the pipeline JSON and environment variables; see the README added in this PR (azureml_pipelines folder) for more information.
A few arguments had to be added to train.py: input and output. These allow the pipeline to pass in mounted Blob storage locations that the script can use as a regular file system.
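A minimal sketch of what such arguments could look like in train.py (the exact flag names and help text are assumptions, not the actual diff):

```python
import argparse

parser = argparse.ArgumentParser()
# Mounted blob storage paths injected by the AzureML pipeline;
# inside the run they behave like ordinary local directories.
parser.add_argument("--input", type=str, help="mounted input data location")
parser.add_argument("--output", type=str, help="mounted output location for artifacts")
args = parser.parse_args()
```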
An azureml_requirements.txt file was added to the training directory. It holds all dependencies for train.py so they can be installed on the compute target in Azure ML.
An empty `__init__.py` was added to the segmentation/dutchf3 module. I was unable to install cv_lib and access cv_lib.segmentation.dutchf3.utils without it when installing from GitHub. (Note: to get this version to work, I am referencing this feature branch in the requirements; it should be changed to microsoft:staging before merging.)
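For context, this is the kind of import that breaks when a package directory ships without an `__init__.py`, since setuptools' find_packages() only collects directories that contain one:

```python
# Fails after installing cv_lib from GitHub if
# cv_lib/segmentation/dutchf3/ lacks an __init__.py, because
# find_packages() skips directories missing that file:
from cv_lib.segmentation.dutchf3 import utils
```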
Two scripts were added: kickoff_train_pipeline.py and cancel_run.py. kickoff_train_pipeline.py shows how to run an Azure ML train pipeline, and cancel_run.py is a development script for cancelling a run instead of letting it consume resources through to completion.
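As a sketch of the cancellation pattern such a script might use (the experiment name and run-ID lookup are assumptions, not the actual cancel_run.py):

```python
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import PipelineRun

ws = Workspace.from_config()
experiment = Experiment(ws, "train-pipeline")  # assumed experiment name

# Cancel a specific pipeline run by ID so it stops consuming compute
run = PipelineRun(experiment, run_id="<run-id>")
run.cancel()
```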
The train_pipeline inherits from base_pipeline, which should be a helpful abstraction when an inference pipeline is added in the future.
Integration tests for kicking off a train pipeline are added. They are not currently run in a build pipeline because they require communication with an Azure ML service instance.
Requirements for kicking off the pipeline were added to interpretation's requirements.txt.
To Test: