-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
configurable normalizations #68
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ziw-liu
reviewed
Feb 24, 2024
ziw-liu
reviewed
Feb 24, 2024
@ziw-liu just tested with new dataset using the |
ziw-liu
approved these changes
Feb 26, 2024
ziw-liu
added a commit
that referenced
this pull request
Apr 8, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * compose normalizations for predict and test stages * black * fix normalization in example config * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * removing normalize_source from configs. * typing fixes * fix test data path * fix test dataset * add docstring for ConcatDataModule * format --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
ziw-liu
added a commit
that referenced
this pull request
Jun 11, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * always use untrainable head for FCMAE * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * custom head * ddp caching fixes * fix caching when using combined loader * compose normalizations for predict and test stages * black * fix normalization in example config * fix normalization in example config * prefetch more in validation * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * typing fixes * fix test dataset * fix invert transform * add ddp prepare flag for combined data module * remove redundant operations * filter empty detections * pass trainer to underlying data modules in concatenated * hack: add test dataloader for LiveCell dataset * test datasets for livecell and ctmc * fix merge error * fix merge error * fix mAP default for over 100 detections * bump torchmetric * fix combined loader training for virtual staining task * fix non-combined data loader training * add fcmae to graph script * fix type hint * format * add back convolutiuon option for fcmae head --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * compose normalizations for predict and test stages * black * fix normalization in example config * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * removing normalize_source from configs. * typing fixes * fix test data path * fix test dataset * add docstring for ConcatDataModule * format --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * always use untrainable head for FCMAE * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * custom head * ddp caching fixes * fix caching when using combined loader * compose normalizations for predict and test stages * black * fix normalization in example config * fix normalization in example config * prefetch more in validation * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * typing fixes * fix test dataset * fix invert transform * add ddp prepare flag for combined data module * remove redundant operations * filter empty detections * pass trainer to underlying data modules in concatenated * hack: add test dataloader for LiveCell dataset * test datasets for livecell and ctmc * fix merge error * fix merge error * fix mAP default for over 100 detections * bump torchmetric * fix combined loader training for virtual staining task * fix non-combined data loader training * add fcmae to graph script * fix type hint * format * add back convolutiuon option for fcmae head --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * compose normalizations for predict and test stages * black * fix normalization in example config * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * removing normalize_source from configs. * typing fixes * fix test data path * fix test dataset * add docstring for ConcatDataModule * format --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * always use untrainable head for FCMAE * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * custom head * ddp caching fixes * fix caching when using combined loader * compose normalizations for predict and test stages * black * fix normalization in example config * fix normalization in example config * prefetch more in validation * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * typing fixes * fix test dataset * fix invert transform * add ddp prepare flag for combined data module * remove redundant operations * filter empty detections * pass trainer to underlying data modules in concatenated * hack: add test dataloader for LiveCell dataset * test datasets for livecell and ctmc * fix merge error * fix merge error * fix mAP default for over 100 detections * bump torchmetric * fix combined loader training for virtual staining task * fix non-combined data loader training * add fcmae to graph script * fix type hint * format * add back convolutiuon option for fcmae head --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * compose normalizations for predict and test stages * black * fix normalization in example config * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * removing normalize_source from configs. * typing fixes * fix test data path * fix test dataset * add docstring for ConcatDataModule * format --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 12, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * always use untrainable head for FCMAE * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * custom head * ddp caching fixes * fix caching when using combined loader * compose normalizations for predict and test stages * black * fix normalization in example config * fix normalization in example config * prefetch more in validation * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * typing fixes * fix test dataset * fix invert transform * add ddp prepare flag for combined data module * remove redundant operations * filter empty detections * pass trainer to underlying data modules in concatenated * hack: add test dataloader for LiveCell dataset * test datasets for livecell and ctmc * fix merge error * fix merge error * fix mAP default for over 100 detections * bump torchmetric * fix combined loader training for virtual staining task * fix non-combined data loader training * add fcmae to graph script * fix type hint * format * add back convolutiuon option for fcmae head --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
edyoshikun
added a commit
that referenced
this pull request
Jun 18, 2024
* refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * always use untrainable head for FCMAE * move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803 * custom head * ddp caching fixes * fix caching when using combined loader * compose normalizations for predict and test stages * black * fix normalization in example config * fix normalization in example config * prefetch more in validation * fix collate when multi-sample transform is not used * ddp caching fixes * fix caching when using combined loader * typing fixes * fix test dataset * fix invert transform * add ddp prepare flag for combined data module * remove redundant operations * filter empty detections * pass trainer to underlying data modules in concatenated * hack: add test dataloader for LiveCell dataset * test datasets for livecell and ctmc * fix merge error * fix merge error * fix mAP default for over 100 detections * bump torchmetric * fix combined loader training for virtual staining task * fix non-combined data loader training * add fcmae to graph script * fix type hint * format * add back convolutiuon option for fcmae head --------- Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com>
ziw-liu
added a commit
that referenced
this pull request
Jul 10, 2024
…bel-free images (#70) * refactor data loading into its own module * update type annotations * move the logging module out * move old logging into utils * rename tests to match module name * bump torch * draft fcmae encoder * add stem to the encoder * wip: masked stem layernorm * wip: patchify masked features for linear * use mlp from timm * hack: POC training script for FCMAE * fix mask for fitting * remove training script * default architecture * fine-tuning options * fix cli for finetuning * draft combined data module * fix import * manual validation loss reduction * update linting new black version has different rules * update development guide * update type hints * bump iohub * draft ctmc v1 dataset * update tests * move test_data * remove path conversion * configurable normalizations (#68) * inital commit adding the normalization. * adding dataset_statistics to each fov to facilitate the configurable augmentations * fix indentation * ruff * test preprocessing * remove redundant field * cleanup --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> * fix ctmc dataloading * add example ctmc v1 loading script * changing the normalization and augmentations default from None to empty list. * invert intensity transform * concatenated data module * subsample videos * livecell dataset * all sample fields are optional * fix multi-dataloader validation * lint * fixing preprocessing for varying array shapes (i.e aics dataset) * update loading scripts * fix CombineMode * added model and annotation code draft * chnaged to simple unet model * start with lesser augmentations * added readme file * added tensorboard logging * added validation step * chnaged to viscy 2d unet * used crossentropyloss with one-hot encoding * added sample image logging * attempt to build magicgui annotation * renamed infection annotation tool * added normalization and augmentations * added model testing code * removed annotation refiner * corrected conversion of class to int * corrected prediction module * cleaned up the code and comments for the LightningUNet * removed confusion matrix code, finding runtime error with model * moved scripts to viscy.scripts.infection_phenotyping module to enable imports across scripts * combine the lightning modules for training and prediction, fix the DDP exception * all the stubs for computing and logging confusion matrix per cell * separated training and test scripts * lightning module * corrected test cm compute * corrected test module * separated test and prediction scripts * changed confusion matrix compute * fix merge error * split 2D and 2.5D model scripts * added covnext script * fix model input parameter * update input file * add augmentations * refactor infection_classification code to viscy/applications * changes made for BJ5 classification * format code * add explicit packaging list * rename testing script * update readme * move function to preprocessing * format code * formatting * histogram with dask * fix index and test * fix import * black * fix float comp * clean up headers * clean up import * add argument to change number of classes --------- Co-authored-by: Ziwen Liu <ziwen.liu@czbiohub.org> Co-authored-by: Eduardo Hirata-Miyasaki <edhiratam@gmail.com> Co-authored-by: Shalin Mehta <shalin.mehta@czbiohub.org> Co-authored-by: Ziwen Liu <67518483+ziw-liu@users.noreply.github.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds configurable normalizations for both the source and targets similar to the transformations/augmentations that are applied to the samples.