This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

The colormap used in mask_to_disk does not support more than 12 classes, which leads to incorrect results for datasets with more than 12 classes #324

Closed
yalaudah opened this issue May 27, 2020 · 0 comments · Fixed by #338
Assignees: yalaudah
Labels: Prior: High, Type: Bug, Type: Enhancement

Comments

@yalaudah
Contributor

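In the file cv_lib/cv_lib/utils.py, mask_to_disk defaults to the Paired colormap. Paired is a qualitative colormap with only 12 colors, so a mask with more than 12 classes cannot get a distinct color for every class.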

def mask_to_disk(mask, fname, cmap_name="Paired"):
    """
    write segmentation mask to disk using a particular colormap
    """
    cmap = plt.get_cmap(cmap_name)
    Image.fromarray(cmap(normalize(mask), bytes=True)).save(fname)

This must be fixed to support an arbitrary number of classes.
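
One possible direction, sketched here under stated assumptions rather than taken from the actual fix: scale the labels by a known class count and look them up in a continuous colormap. The `n_classes` parameter, the helper `mask_to_rgba`, and the `gist_rainbow` default are illustrative choices that do not appear in the original function, and the repository's own `normalize` helper is replaced by an explicit division.

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


def mask_to_rgba(mask, n_classes, cmap_name="gist_rainbow"):
    """Map integer class labels in [0, n_classes - 1] to RGBA colors.

    Hypothetical helper for illustration; not part of cv_lib.
    """
    cmap = plt.get_cmap(cmap_name)
    # Scale by the known class count, not by the values present in this
    # particular mask, so a given class always maps to the same color.
    scaled = np.asarray(mask, dtype=np.float64) / max(n_classes - 1, 1)
    # bytes=True returns a uint8 array of shape mask.shape + (4,).
    return cmap(scaled, bytes=True)


def mask_to_disk(mask, fname, n_classes, cmap_name="gist_rainbow"):
    """Write a segmentation mask to disk as a color image."""
    Image.fromarray(mask_to_rgba(mask, n_classes, cmap_name)).save(fname)


# Usage: a toy mask with one pixel per class, 20 classes in total.
mask = np.arange(20).reshape(4, 5)
colors = mask_to_rgba(mask, n_classes=20)
n_unique = len({tuple(c) for c in colors.reshape(-1, 4)})
```

Passing `cmap_name="Paired"` to the same helper yields at most 12 distinct colors for this mask, which is the collision described above. A continuous colormap is still backed by a finite lookup table (256 entries by default in matplotlib), so this raises the limit rather than removing it entirely.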

@yalaudah yalaudah changed the title from "The colormap used in mask_to_disk does not support more than 12 classes." to "The colormap used in mask_to_disk does not support more than 12 classes, which leads to incorrect results for datasets with more than 12 classes" May 28, 2020
@yalaudah yalaudah added this to the V0.1.3 [BYOD] milestone May 28, 2020
@yalaudah yalaudah added the "Type: Bug" label May 28, 2020
@maxkazmsft maxkazmsft added the "Type: Enhancement" label May 28, 2020
@yalaudah yalaudah added the "Status: In-progress" label Jun 1, 2020
@yalaudah yalaudah self-assigned this Jun 1, 2020
@yalaudah yalaudah linked a pull request Jun 1, 2020 that will close this issue
yalaudah pushed a commit that referenced this issue Jun 3, 2020
* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout
@yalaudah yalaudah removed the "Status: In-progress" label Jun 3, 2020
@maxkazmsft maxkazmsft modified the milestones: V0.1.4 [BYOD], V0.1.3 Jun 4, 2020
maxkazmsft added a commit that referenced this issue Jun 8, 2020
* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
maxkazmsft added a commit that referenced this issue Jul 7, 2020
* V00.01.00003 release (#356)

* typos

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
maxkazmsft added a commit that referenced this issue Jul 7, 2020
* PR to fix #342 (#347)

* intermediate work for normalization

* 1) normalize function runs based on global MIN and MAX 2) has error handling for division by zero, np.finfo 3) decode_segmap normalizes the label/mask based on n_classes

* global normalization added to test.py

* increasing the threshold on timeout

* trigger

* revert

* idk what happened

* increase timeout

* picking up global min and max

* passing config to TrainPatchLoader to facilitate access to global min and max and other attr in low level functions, WIP

* removed print statement

* changed section loaders

* updated test for min and max from config too

* added MIN and MAX to config

* notebook modified for loaders

* another dataloader in notebook

* readme update

* changed the default values for min max, updated the docstring for loaders, removed suppressed lines

* debug

* merging work from CSE team into main staging branch (#357)

* Adding content to interpretation README (#171)

* added sharat, weehyong to authors

* adding a download script for Dutch F3 dataset

* Adding script instructions for dutch f3

* Update README.md

prepare scripts expect root level directory for dutch f3 dataset. (it is downloaded into $dir/data by the script)

* Adding readme text for the notebooks and checking if config is correctly setup

* fixing prepare script example

* Adding more content to interpretation README

* Update README.md

* Update HRNet_Penobscot_demo_notebook.ipynb

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* Updates to prepare dutchf3 (#185)

* updating patch to patch_size when we are using it as an integer

* modifying the range function in the prepare_dutchf3 script to get all of our data

* updating path to logging.config so the script can locate it

* manually reverting back log path to troubleshoot build tests

* updating patch to patch_size for testing on preprocessing scripts

* updating patch to patch_size where applicable in ablation.sh

* reverting back changes on ablation.sh to validate build pass

* update patch to patch_size in ablation.sh (#191)

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>

* TestLoader's support for custom paths (#196)

* Add testloader support for custom paths.

* Add test

* added file name workaround for Train*Loader classes

* adding comments and clean up

* Remove legacy code.

* Remove parameters that dont exist in init() from documentation.

* Add unit tests for data loaders in dutchf3

* moved unit tests

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* select contiguous data splits for val and train (#200)

* select contiguous data splits for test and train

* changed data-dir to data_dir as arg to prepare_dutchf3.py

* update script with new required parameter label_file

* ignoring split_alaudah_et_al_19 as it is not updated

* changed TEST to VALIDATION for clarity in the code

* included job to run scripts unit test

* Fix val/train split and add tests

* adjust to consider the whole horz_lines

* update environment - gitpython version

* Segy Converter Utility (#199)

* Add convert_segy utility script and related notebooks

* add segy files to .gitignore

* readability update

* Create methods for normalizing and clipping separately.

* Add comment

* update file paths

* cleanup tests and terminology for the normalization/clipping code

* update notes to provide more context for using the script

* Add tests for clipping.

* Update comments

* added Microsoft copyright

* Update root README

* Add a flag to turn on clipping in dataprep script.

* Remove hard coded values and fix _filder_data method.

* Fix some minor issues pointed out on comments.

* Remove unused lib.

* Rename notebooks to impose order; set env; move all def functions into utils; improve comments in notebooks; and include code example to run prepare_dutchf3.py

* Label missing data with 255.

* Remove cell with --help command.

* Add notebooks to test pipeline.

* grammar edits

* update notebook output and utils naming

* fix output dir error and cleanup notebook

* fix yaml indent error in notebooks_build.yml

* fix merge issues and job name errors

* debugging the build pipeline

* combine notebook tests for segy converter since they are dependent on each other

Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>

* Azureml train pipeline (#195)

* initial add of azure ml pipeline

* update references and dependencies

* fix integration tests

* remove incomplete tests

* add azureml requirements.txt for dutchf3 local patch and update pipeline config

* add empty __init__.py to cv_lib dutchf3

* Get train.py to run in pipeline

* allow output dir in train.py

* Clean up README and __init__

* only pass output if available and use input dir for output in train.py

* update comment in train.py

* updating azureml_requirements to only pull from /master

* removing windows guidance in azureml_pipelines/README.md

* adding .env.example

* adding azureml config example

* updating documentation in azureml_pipelines README.md

* updating main README.md to refer to AML guidance documentation

* updating AML README.md to include additional guidance to cancel runs

* adding documentation on AzureML pipelines in the AML README.md

* adding files needed section for AML training run

* including hyperlink in format pointing to additional detail on Azure Machine Learning pipelines in AML README.md

* removing the mention of VSCode in the AML README.md

* fixing typo

* modifying config to pipeline configuration in README.md

* fixing typo in README.md

* adding documentation on how to create a blob container and copy data onto it

* adding documentation on blob storage guidance

* adding guidance on how to get the subscription id

* adding guidance to activate environment and then run the kick off train pipeline from ROOT

* adding ability to pass in experiment name and different pipeline configuration to kickoff_train_pipeline.py

* adding Microsoft Corporation Copyright to kickoff_train_pipeline.py

* fixing format in README.md

* adding trouble shooting section in README.md for connection to subscription

* updating troubleshooting title

* adding guidance on how to download the config.json from the Azure Portal in the README.md

* adding additional guidance and information on AzureML compute targets and naming conventions

* changing the configuration file example to only include the train step that is currently supported

* updating config to pipeline configuration when applicable

* adding link to Microsoft docs for additional information on pipeline steps

* updated AML test build definitions

* updated AML test build definitions

* adding job to aml_build.yml

* updating example config for testing

* modifying the test_train_pipeline.py to have appropriate number of pipeline steps and other required modifications

* updating AML_pipeline_tests in aml_build.yml to consume environment variables

* updating scriptType, scriptLocation, and inlineScript in aml_build.yml

* trivial commit to re-trigger broken build pipelines

* fix to aml yml build to use env vars for secrets and everything else

* another yml fix

* another yml fix

* reverting structure format of jobs for aml_build pipeline tests

* updating path to test_train_pipeline.py

* aml_pipeline_tests timed out, extending timeoutInMinutes from 10 to 40

* adding additional pytest

* adding az login

* updating variables in aml pipeline tests

Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* moved contrib contributions around from CSE

* fixed dataloader tests - updated them to work with new code from staging branch

* segyconverter notebooks and tests run and pass; updated documentation

* added test job for segy converter notebooks

* removed AML training pipeline from this release

* fixed training model tolerance precision in the tests - wasn't working

* fixed train.py build issues after the merge

* addressed PR comments

* fixed bug in check_performance

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>

* make tests simpler (#368)

* removed Dutch F3 job from main_build

* fixed a bug in data subset in debug mode

* modified epoch numbers to pass the performance checks, checkedout check_performance from Max's branch

* modified get_data_for_builds.sh to set up checkerboard data for smaller size, minor improvements on gen_checkerboard

* send all the batches, disabled the performance checks for patch_deconvnet

* added comment to enable tests for patch_deconvnet after debugging, renamed gen_checkerboard, added options to new arg per Max's suggestion

* Replace HRNet with SEResNet model in the notebook (#362)

* replaced HRNet with SEResNet model in the notebook

* removed debugging cell info

* fixed bug where resnet_unet model wasn't loading the pre-trained version in the notebook

* fixed build VM problems

* Multi-GPU training support (#359)

* Data flow tests (#375)

* renamed checkerboard job name

* restructured default outputs from test.py to be dumped under output dir and not debug dir

* test.py output re-org

* removed outdated variable from check_performance.py

* intermediate work

* intermediate work

* bunch of intermediate works

* changing args for different trainings

* final to run dev_build

* remove print statements

* removed print statement

* removed suppressed lines

* added assertion error msg

* added assertion error msg, one intentional bug to test

* testing a stupid bug

* debug

* omg

* final

* trigger build

* fixed multi-GPU termination in train.py (#379)

* PR to fix #371 and #372  (#380)

* added learning rate to logs

* changed epoch for patch_deconvnet, and enabled the tests

* removed TODOs

* changed tensorflow pinned version (#387)

* changed tensorflow pinned version

* trigger build

* closes 385 (#389)

* Fixing #259 by adding symmetric padding along depth direction  (#386)

* BYOD Penobscot (#390)

* minor updates to files

* added penobscot conversion code

* docker build test (#388)

* added a new job to test building the docker, for now it is daisy-chained to the end

* this is just a TEST

* test

* test

* remove old image

* debug

* debug

* test

* debug

* enabled all the jobs

* quick fix

* removing non-tagged images

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* added missing license headers and fixed formatting (#391)

* added missing license headers and fixed formatting

* some more license headers

* updated documentation to close 354 and 381 (#392)

* fix test.py and notebook issues (#394)

* resolved conflicts for 0.2 release (#396)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
maxkazmsft added a commit that referenced this issue Jul 22, 2020
* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mound command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument conssistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

* PR to fix #342 (#347)

* intermediate work for normalization

* 1) normalize function runs based on global MIN and MAX 2) has a error handling for division by zero, np.finfo 3) decode_segmap normalizes the label/mask based on the n_calsses

* global normalization added to test.py

* increasing the threshold on timeout

* trigger

* revert

* idk what happened

* increase timeout

* picking up global min and max

* passing config to TrainPatchLoader to facilitate access to global min and max and other attr in low level functions, WIP

* removed print statement

* changed section loaders

* updated test for min and max from config too

* adde MIN and MAX to config

* notebook modified for loaders

* another dataloader in notebook

* readme update

* changed the default values for min max, updated the docstring for loaders, removed suppressed lines

* debug

* merging work from CSE team into main staging branch (#357)

* Adding content to interpretation README (#171)

* added sharat, weehyong to authors

* adding a download script for Dutch F3 dataset

* Adding script instructions for dutch f3

* Update README.md

prepare scripts expect root level directory for dutch f3 dataset. (it is downloaded into $dir/data by the script)

* Adding readme text for the notebooks and checking if config is correctly setup

* fixing prepare script example

* Adding more content to interpretation README

* Update README.md

* Update HRNet_Penobscot_demo_notebook.ipynb

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* Updates to prepare dutchf3 (#185)

* updating patch to patch_size when we are using it as an integer

* modifying the range function in the prepare_dutchf3 script to get all of our data

* updating path to logging.config so the script can locate it

* manually reverting back log path to troubleshoot build tests

* updating patch to patch_size for testing on preprocessing scripts

* updating patch to patch_size where applicable in ablation.sh

* reverting back changes on ablation.sh to validate build pass

* update patch to patch_size in ablation.sh (#191)

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>

* TestLoader's support for custom paths (#196)

* Add testloader support for custom paths.

* Add test

* added file name workaround for Train*Loader classes

* adding comments and clean up

* Remove legacy code.

* Remove parameters that dont exist in init() from documentation.

* Add unit tests for data loaders in dutchf3

* moved unit tests

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* select contiguous data splits for val and train (#200)

* select contiguous data splits for test and train

* changed data-dir to data_dir as arg to prepare_dutchf3.py

* update script with new required parameter label_file

* ignoring split_alaudah_et_al_19 as it is not updated

* changed TEST to VALIDATION for clarity in the code

* included job to run scripts unit test

* Fix val/train split and add tests

* adjust to consider the whole horz_lines

* update environment - gitpython version

* Segy Converter Utility (#199)

* Add convert_segy utility script and related notebooks

* add segy files to .gitignore

* readability update

* Create methods for normalizing and clipping separately.

* Add comment

* update file paths

* cleanup tests and terminology for the normalization/clipping code

* update notes to provide more context for using the script

* Add tests for clipping.

* Update comments

* added Microsoft copyright

* Update root README

* Add a flag to turn on clipping in dataprep script.

* Remove hard coded values and fix _filder_data method.

* Fix some minor issues pointed out on comments.

* Remove unused lib.

* Rename notebooks to impose order; set env; move all def funtions into utils; improve comments in notebooks; and include code example to run prepare_dutchf3.py

* Label missing data with 255.

* Remove cell with --help command.

* Add notebooks to test pipeline.

* grammer edits

* update notebook output and utils naming

* fix output dir error and cleanup notebook

* fix yaml indent error in notebooks_build.yml

* fix merge issues and job name errors

* debugging the build pipeline

* combine notebook tests for segy converter since they are dependent on each other

Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>

* Azureml train pipeline (#195)

* initial add of azure ml pipeline

* update references and dependencies

* fix integration tests

* remove incomplete tests

* add azureml requirements.txt for dutchf3 local patch and update pipeline config

* add empty __init__.py to cv_lib dutchf3

* Get train,py to run in pipeline

* allow output dir in train.py

* Clean up README and __init__

* only pass output if available and use input dir for output in train.py

* update comment in train.py

* updating azureml_requirements to only pull from /master

* removing windows guidance in azureml_pipelines/README.md

* adding .env.example

* adding azureml config example

* updating documentation in azureml_pipelines README.md

* updating main README.md to refer to AML guidance documentation

* updating AML README.md to include additional guidance to cancel runs

* adding documentation on AzureML pipelines in the AML README.me

* adding files needed section for AML training run

* including hyperlink in format poiniting to additional detail on Azure Machine Learning pipeslines in AML README.md

* removing the mention of VSCode in the AML README.md

* fixing typo

* modifying config to pipeline configuration in README.md

* fixing typo in README.md

* adding documentation on how to create a blob container and copy data onto it

* adding documentation on blob storage guidance

* adding guidance on how to get the subscription id

* adding guidance to activate environment and then run the kick off train pipeline from ROOT

* adding ability to pass in experiement name and different pipeline configuration to kickoff_train_pipeline.py

* adding Microsoft Corporation Copyright to kickoff_train_pipeline.py

* fixing format in README.md

* adding trouble shooting section in README.md for connection to subscription

* updating troubleshooting title

* adding guidance on how to download the config.json from the Azure Portal in the README.md

* adding additional guidance and information on AzureML compute targets and naming conventions

* changing the configuation file example to only include the train step that is currently supported

* updating config to pipeline configuration when applicable

* adding link to Microsoft docs for additional information on pipeline steps

* updated AML test build definitions

* updated AML test build definitions

* adding job to aml_build.yml

* updating example config for testing

* modifying the test_train_pipeline.py to have appropriate number of pipeline steps and other required modifications

* updating AML_pipeline_tests in aml_build.yml to consume environment variables

* updating scriptType, sciptLocation, and inlineScript in aml_build.yml

* trivial commit to re-trigger broken build pipelines

* fix to aml yml build to use env vars for secrets and everything else

* another yml fix

* another yml fix

* reverting structure format of jobs for aml_build pipeline tests

* updating path to test_train_pipeline.py

* aml_pipeline_tests timed out, extending timeoutInMinutes from 10 to 40

* adding additional pytest

* adding az login

* updating variables in aml pipeline tests

Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* moved contrib contributions around from CSE

* fixed dataloader tests - updated them to work with new code from staging branch

* segyconverter notebooks and tests run and pass; updated documentation

* added test job for segy converter notebooks

* removed AML training pipeline from this release

* fixed training model tolerance precision in the tests - wasn't working

* fixed train.py build issues after the merge

* addressed PR comments

* fixed bug in check_performance

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>

* make tests simpler (#368)

* removed Dutch F3 job from main_build

* fixed a bug in data subset in debug mode

* modified epoch numbers to pass the performance checks, checkedout check_performance from Max's branch

* modified get_data_for_builds.sh to set up checkerboard data for smaller size, minor improvements on gen_checkerboard

* send all the batches, disabled the performance checks for patch_deconvnet

* added comment to enable tests for patch_deconvnet after debugging, renamed gen_checkerboard, added options to new arg per Max's suggestion

* Replace HRNet with SEResNet model in the notebook (#362)

* replaced HRNet with SEResNet model in the notebook

* removed debugging cell info

* fixed bug where resnet_unet model wasn't loading the pre-trained version in the notebook

* fixed build VM problems

* Multi-GPU training support (#359)

* Data flow tests (#375)

* renamed checkerboard job name

* restructured default outputs from test.py to be dumped under output dir and not debug dir

* test.py output re-org

* removed outdated variable from check_performance.py

* intermediate work

* intermediate work

* bunch of intermediate works

* changing args for different trainings

* final to run dev_build

* remove print statements

* removed print statement

* removed suppressed lines

* added assertion error msg

* added assertion error msg, one intentional bug to test

* testing a stupid bug

* debug

* omg

* final

* trigger build

* fixed multi-GPU termination in train.py (#379)

* PR to fix #371 and #372 (#380)

* added learning rate to logs

* changed epoch for patch_deconvnet, and enabled the tests

* removed TODOs

* changed tensorflow pinned version (#387)

* changed tensorflow pinned version

* trigger build

* closes 385 (#389)

* Fixing #259 by adding symmetric padding along depth direction (#386)

* BYOD Penobscot (#390)

* minor updates to files

* added penobscot conversion code

* docker build test (#388)

* added a new job to test building the docker, for now it is daisy-chained to the end

* this is just a TEST

* test

* test

* remove old image

* debug

* debug

* test

* debug

* enabled all the jobs

* quick fix

* removing non-tagged images

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* added missing license headers and fixed formatting (#391)

* added missing license headers and fixed formatting

* some more license headers

* updated documentation to close 354 and 381 (#392)

* fix test.py and notebook issues (#394)

* resolved conflicts for 0.2 release (#396)

* V00.01.00003 release (#356)

* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

* typos

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

* tensorboard notebook fix & loading of pre-trained models fix (#397)

Co-authored-by: Max Kaznady <max.kaznady@gmail.com>

* Docker README corrections and pretrained model checking (#398)

* added better instructions to Docker readme; removed HRNet references

* added checking of pre-trained models on startup

* Update docker/README.md

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>

* added more README changes and a video link with overview

* readme tweaks

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>

* finalized performance metrics (#399)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: Max Kaznady <max.kaznady@gmail.com>
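
For reference, the two #324 entries in the log above (switching to a non-discrete colormap and normalizing by n_classes) can be illustrated with a minimal sketch. This is not the code merged in #338: the helper name `mask_to_rgba`, the `n_classes` parameter, the `rainbow` default, and the division by `n_classes - 1` are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


def mask_to_rgba(mask, n_classes, cmap_name="rainbow"):
    """Map integer class labels to RGBA colors with a continuous colormap.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    cmap = plt.get_cmap(cmap_name)
    # Scale labels 0..n_classes-1 into [0, 1] as floats. Passing integers
    # would make the colormap treat them as raw lookup-table indices.
    scaled = np.asarray(mask, dtype=np.float64) / max(n_classes - 1, 1)
    # bytes=True returns a uint8 array of shape mask.shape + (4,)
    return cmap(scaled, bytes=True)


def mask_to_disk(mask, fname, n_classes, cmap_name="rainbow"):
    """Write a segmentation mask to disk as a color image (assumed signature)."""
    Image.fromarray(mask_to_rgba(mask, n_classes, cmap_name)).save(fname)


if __name__ == "__main__":
    # 20 classes laid out on a 4 x 5 grid, more than Paired's 12 colors
    demo = np.arange(20).reshape(4, 5)
    rgba = mask_to_rgba(demo, n_classes=20)
    print(len(np.unique(rgba.reshape(-1, 4), axis=0)), "distinct colors")
```

Because the colormap is continuous, the number of distinguishable classes is no longer capped at the 12 entries of `Paired`; very large class counts would still eventually collide at the colormap's lookup-table resolution.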