model parallel training research (#616)
* adds network
* adds basic training
* update loading
* working prototype
* update validation set
* [MONAI] Add author; paper info; PDDCA18 (#6)
  + Author
  + Early accept
  + PDDCA18 link
* Update README.md
* adds network
* adds basic training
* update loading
* working prototype
* update validation set
* [MONAI] Update TRAIN_PATH, VAL_PATH (#8)
* [MONAI] Add data link (#7): https://drive.google.com/file/d/1A2zpVlR3CkvtkJPvtAF3-MH0nr1WZ2Mn/view?usp=sharing
* fixes typos
* tested new dataset
* print more info, checked new dataset
* [MONAI] Add paper link (#9): https://arxiv.org/abs/2006.12575
* [MONAI] Use dice loss + focal loss to train (#10)
* [MONAI] Support non-one-hot ground truth (#11)
* fixes format and docstrings, adds argparser options
* resume the focal_loss
* adds tests
* [MONAI] Support non-one-hot ground truth (#11)
* adds tests
* update docstring
* [MONAI] Keep track of best validation scores (#12)
* model saving
* adds window sampling
* update readme
* update docs
* fixes flake8 error
* update window sampling
* fixes model name
* fixes channel size issue
* [MONAI] Update --pretrain, --lr (#13)
  + lr from 5e-4 to 1e-3 because we use the mean over the class channel instead of the sum
  + pretrain path is consistent with the current model_name
* [MONAI] Pad image; elastic; best class model (#14)
  + pad images bigger than crop_size to avoid potential issues in RandCropByPosNegLabeld
  + use Rand3DElasticd
  + save the best model for each class
* Update train.py (Co-authored-by: Wenqi Li <wenqil@nvidia.com>)
* flake8 fixes
* removes -1 cropsize deform
* testing commands
* fixes unit tests
* update spatial padding
* [MONAI] Add full image deform augmentation (#15)
  + add full image deform augmentation via Rand3DElasticd
  + please use the latest MONAI (see #623)
* Adding py.typed
* updating setup.py to comply with black
* update based on comments
* excluding research from packaging
* update tests
* update setup.py

Co-authored-by: Wentao Zhu <wentaozhu1991@gmail.com>
Co-authored-by: Neil Tenenholtz <ntenenz@users.noreply.github.com>
Co-authored-by: Nic Ma <nma@nvidia.com>
1 parent f262355 · commit 379c959 · Showing 8 changed files with 595 additions and 1 deletion.
@@ -0,0 +1,53 @@

# LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation

<p>
<img src="./fig/acc_speed_han_0_5hor.png" alt="LAMP on Head and Neck Dataset" width="500"/>
</p>

> If you use this work in your research, please cite the paper.

A reimplementation of the LAMP system originally proposed by:

Wentao Zhu, Can Zhao, Wenqi Li, Holger Roth, Ziyue Xu, and Daguang Xu (2020)
"LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation."
MICCAI 2020 (Early Accept, paper link: https://arxiv.org/abs/2006.12575)

## To run the demo

### Prerequisites
- install the latest version of MONAI: `git clone https://github.com/Project-MONAI/MONAI`, then run `pip install -e .` from the cloned folder
- `pip install torchgpipe`

### Data
```bash
mkdir ./data;
cd ./data;
```
Head and Neck CT dataset

Please download and unzip the images into the `./data` folder:

- `HaN.zip`: https://drive.google.com/file/d/1A2zpVlR3CkvtkJPvtAF3-MH0nr1WZ2Mn/view?usp=sharing
```bash
unzip HaN.zip;  # extract the dataset
```

More details about the dataset are available at https://github.com/wentaozhu/AnatomyNet-for-anatomical-segmentation.git
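Before training, it can help to confirm that every case folder contains the expected files. The sketch below is only a layout check under assumptions: the archive is assumed to unpack into `./data/HaN`, and each case folder is assumed to hold an `img_crp_v2.npy` volume plus a `structures/` sub-folder of masks, matching the file names used by the loading code added later in this commit. Adjust the paths if your layout differs.

```python
# Quick sanity check of the assumed on-disk layout (a sketch, not part of the project).
import os

data_root = "./data/HaN"  # assumed unzip target
for case in sorted(os.listdir(data_root)):
    case_dir = os.path.join(data_root, case)
    if not os.path.isdir(case_dir):
        continue
    has_image = os.path.exists(os.path.join(case_dir, "img_crp_v2.npy"))
    struct_dir = os.path.join(case_dir, "structures")
    n_masks = len(os.listdir(struct_dir)) if os.path.isdir(struct_dir) else 0
    print(f"{case}: image={has_image}, structure masks={n_masks}")
```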
### Minimal hardware requirements for full image training
- U-Net (`n_feat=32`): 2x 16 GB GPUs
- U-Net (`n_feat=64`): 4x 16 GB GPUs
- U-Net (`n_feat=128`): 2x 32 GB GPUs

### Commands
The number of features in the first block (`--n_feat`) can be 32, 64, or 128. The runs below use progressively larger crops (`--crop_size='-1,-1,-1'` corresponds to the full-image setting described above), with `--pretrain` resuming from an earlier checkpoint.
```bash
mkdir ./log;
python train.py --n_feat=128 --crop_size='64,64,64' --bs=16 --ep=4800 --lr=0.001 > ./log/YOURLOG.log
python train.py --n_feat=128 --crop_size='128,128,128' --bs=4 --ep=1200 --lr=0.001 --pretrain='./HaN_32_16_1200_64,64,64_0.001_*' > ./log/YOURLOG.log
python train.py --n_feat=128 --crop_size='-1,-1,-1' --bs=1 --ep=300 --lr=0.001 --pretrain='./HaN_32_16_1200_64,64,64_0.001_*' > ./log/YOURLOG.log
```
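The hardware requirements above come from pipeline-style model parallelism: the network is split into stages that live on different GPUs, which is what the `torchgpipe` dependency provides. The snippet below is a generic sketch of that mechanism with a toy `nn.Sequential`, not the project's actual `train.py`; the layers, `balance`, and `chunks` values are arbitrary placeholders.

```python
# Generic illustration of pipeline model parallelism with torchgpipe.
# GPipe expects an nn.Sequential and a per-stage "balance"; it requires at
# least as many visible CUDA devices as there are pipeline stages (2 here).
import torch
from torch import nn
from torchgpipe import GPipe

model = nn.Sequential(  # toy stand-in for a large segmentation network
    nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv3d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv3d(8, 2, 3, padding=1),
)
# First two layers on GPU 0, remaining three on GPU 1; each mini-batch is
# split into 4 micro-batches that flow through the pipeline.
model = GPipe(model, balance=[2, 3], chunks=4)

x = torch.randn(4, 1, 32, 32, 32).to(model.devices[0])  # input on the first stage
y = model(x)  # output lives on the last stage's device
print(y.shape, y.device)
```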
@@ -0,0 +1,10 @@

# Copyright 2020 MONAI Consortium
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,66 @@

# Copyright 2020 MONAI Consortium
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import numpy as np
from monai.transforms import DivisiblePad

# The nine head-and-neck structures segmented in the HaN dataset.
STRUCTURES = (
    "BrainStem",
    "Chiasm",
    "Mandible",
    "OpticNerve_L",
    "OpticNerve_R",
    "Parotid_L",
    "Parotid_R",
    "Submandibular_L",
    "Submandibular_R",
)


def get_filenames(path, maskname=STRUCTURES):
    """
    Create file names according to the predefined folder structure.

    Args:
        path: data folder name.
        maskname: target structure names.
    """
    maskfiles = []
    for seg in maskname:
        mask_file = os.path.join(path, "structures", seg + "_crp_v2.npy")
        if os.path.exists(mask_file):
            maskfiles.append(mask_file)
        else:
            # the mask for this structure is missing for this case
            maskfiles.append(None)
    return os.path.join(path, "img_crp_v2.npy"), maskfiles


def load_data_and_mask(data, mask_data):
    """
    Load the image file name ``data`` and the mask file names ``mask_data``
    into a dictionary of {"image": array, "label": list of arrays, "name": str}.
    """
    pad_xform = DivisiblePad(k=32)  # pad every spatial dim to a multiple of 32
    img = np.load(data)  # axis order: z, y, x
    img = pad_xform(img[None])[0]
    item = dict(image=img, label=[])
    for maskfnm in mask_data:
        if maskfnm is None:
            # missing structure: use an all-background mask of the padded image shape
            ms = np.zeros(img.shape, np.uint8)
        else:
            ms = np.load(maskfnm).astype(np.uint8)
            assert ms.min() == 0 and ms.max() == 1
        mask = pad_xform(ms[None])[0]
        item["label"].append(mask)
    assert len(item["label"]) == 9
    item["name"] = str(data)
    return item
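As a quick check of these two helpers, the sketch below loads a single case; the module name `data_utils` and the case folder path are assumptions made for illustration, so adjust the import and path to match your checkout and data layout.

```python
# Hypothetical usage of the two helpers above; the module name "data_utils"
# and the case folder path are assumed for illustration.
from data_utils import get_filenames, load_data_and_mask

img_file, mask_files = get_filenames("./data/HaN/0522c0001")  # assumed case folder
item = load_data_and_mask(img_file, mask_files)
print(item["name"])
print("image:", item["image"].shape)                      # padded to multiples of 32
print("labels:", len(item["label"]), item["label"][0].shape)
```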
Binary file added (BIN, +128 KB): research/lamp-automated-model-parallelism/fig/acc_speed_han_0_5hor.png
research/lamp-automated-model-parallelism/test_unet_pipe.py (52 additions, 0 deletions)
@@ -0,0 +1,52 @@

# Copyright 2020 MONAI Consortium
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import unittest

import torch
from parameterized import parameterized

from unet_pipe import UNetPipe

TEST_CASES = [
    [  # 1-channel 3D, batch 12
        {"spatial_dims": 3, "out_channels": 2, "in_channels": 1, "depth": 3, "n_feat": 8},
        torch.randn(12, 1, 32, 64, 48),
        (12, 2, 32, 64, 48),
    ],
    [  # 1-channel 3D, batch 16
        {"spatial_dims": 3, "out_channels": 2, "in_channels": 1, "depth": 3},
        torch.randn(16, 1, 32, 64, 48),
        (16, 2, 32, 64, 48),
    ],
    [  # 2-channel 3D, batch 16, batch normalisation
        {"spatial_dims": 3, "out_channels": 3, "in_channels": 2},
        torch.randn(16, 2, 64, 64, 64),
        (16, 3, 64, 64, 64),
    ],
]


class TestUNETPipe(unittest.TestCase):
    @parameterized.expand(TEST_CASES)
    def test_shape(self, input_param, input_data, expected_shape):
        net = UNetPipe(**input_param)
        if torch.cuda.is_available():
            net = net.to(torch.device("cuda"))
            input_data = input_data.to(torch.device("cuda"))
        net.eval()
        with torch.no_grad():
            result = net(input_data.float())
        self.assertEqual(result.shape, expected_shape)


if __name__ == "__main__":
    unittest.main()
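For a quick check outside the unit-test runner, the same forward pass can be run directly; the sketch below mirrors the first test case above and exercises only the `UNetPipe` arguments shown there.

```python
# Minimal standalone shape check mirroring the first test case above
# (runs on CPU; move the net and input to CUDA if a GPU is available).
import torch
from unet_pipe import UNetPipe

net = UNetPipe(spatial_dims=3, in_channels=1, out_channels=2, depth=3, n_feat=8)
net.eval()
with torch.no_grad():
    out = net(torch.randn(2, 1, 32, 64, 48).float())
print(out.shape)  # expected: torch.Size([2, 2, 32, 64, 48])
```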