diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9f232ac79..cf81acf80 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,258 @@
-# cuCIM 22.02.00 (Date TBD)
+# cuCIM 22.02.00 (2 Feb 2022)
 
-Please see https://github.com/rapidsai/cucim/releases/tag/v22.02.00a for the latest changes to this development branch.
+## 🚨 Breaking Changes
+
+- Update cucim.skimage API to match scikit-image 0.19 ([#190](https://github.com/rapidsai/cucim/pull/190)) [@glee77](https://github.com/glee77)
+
+## 🐛 Bug Fixes
+
+- Fix a bug in [v21.12.01](https://github.com/rapidsai/cucim/wiki/release_notes_v21.12.01) ([#191](https://github.com/rapidsai/cucim/pull/191)) [@gigony](https://github.com/gigony)
+  - Fix GPU memory leak when using nvJPEG API (when `device='cuda'` parameter is used in `read_region` method).
+- Fix segfault for preferred_memory_capacity in Python 3.9+ ([#214](https://github.com/rapidsai/cucim/pull/214)) [@gigony](https://github.com/gigony)
+
+## 📖 Documentation
+
+- PyPI v21.12.00 release ([#182](https://github.com/rapidsai/cucim/pull/182)) [@gigony](https://github.com/gigony)
+
+## 🚀 New Features
+
+1. Update cucim.skimage API to match scikit-image 0.19 ([#190](https://github.com/rapidsai/cucim/pull/190)) [@glee77](https://github.com/glee77)
+2. Support multi-threads and batch, and support nvJPEG for JPEG-compressed images ([#191](https://github.com/rapidsai/cucim/pull/191)) [@gigony](https://github.com/gigony)
+3. Allow CuPy 10 ([#195](https://github.com/rapidsai/cucim/pull/195)) [@jakikham](https://github.com/jakikham)
+
+### 1. Update cucim.skimage API to match scikit-image 0.19 (🚨 Breaking Changes)
+
+#### channel_axis support
+
+scikit-image 0.19 adds a `channel_axis` argument that should now be used instead of the `multichannel` boolean.
+
+In scikit-image 1.0, the `multichannel` argument will likely be removed so we start supporting `channel_axis` in cuCIM.
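
A minimal sketch of how the legacy flag relates to the new argument, assuming the scikit-image 0.19 convention that `multichannel=True` corresponds to `channel_axis=-1` and `multichannel=False` to `channel_axis=None`. The name `resolve_channel_axis` is a hypothetical helper written for illustration only; it is not part of cuCIM or scikit-image.

```python
import warnings


def resolve_channel_axis(multichannel=None, channel_axis=None):
    """Hypothetical helper: translate the legacy `multichannel` flag.

    Returns the axis index that holds the channels, or None for
    single-channel (grayscale) data.
    """
    if multichannel is None:
        # New-style call: `channel_axis` is used as given.
        return channel_axis
    warnings.warn(
        "`multichannel` is deprecated; use `channel_axis` instead.",
        FutureWarning,
        stacklevel=2,
    )
    # Legacy semantics: True meant "channels are on the last axis".
    return -1 if multichannel else None


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    axis = resolve_channel_axis(multichannel=True)

print(axis, [w.category.__name__ for w in caught])  # -1 ['FutureWarning']
```

Under the same assumption, a call written as `some_filter(image, multichannel=True)` would be migrated to `some_filter(image, channel_axis=-1)`, where `some_filter` stands in for any `cucim.skimage` function that accepts the new argument.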
+
+This pulls changes from many scikit-image 0.19.0 PRs related to deprecating `multichannel` in favor of `channel_axis`. A few other minor PRs related to deprecations and updates to `color.label2rgb` are incorporated here as well.
+
+The changes are mostly non-breaking, although a couple of deprecated functions have been removed (`rgb2grey`, `grey2rgb`) and the default value of `label2rgb`'s `bg_label` argument has changed. The deprecated `alpha` argument was also removed from `gray2rgb`.
+
+Implements:
+
+- [Add saturation parameter to color.label2rgb #5156](https://github.com/scikit-image/scikit-image/pull/5156)
+- [Decorators for helping with the multichannel->channel_axis transition #5228](https://github.com/scikit-image/scikit-image/pull/5228)
+- [multichannel to channel_axis (1 of 6): features and draw #5284](https://github.com/scikit-image/scikit-image/pull/5284)
+- [multichannel to channel_axis (2 of 6): transform functions #5285](https://github.com/scikit-image/scikit-image/pull/5285)
+- [multichannel to channel_axis (3 of 6): filters #5286](https://github.com/scikit-image/scikit-image/pull/5286)
+- [multichannel to channel_axis (4 of 6): metrics and measure #5287](https://github.com/scikit-image/scikit-image/pull/5287)
+- [multichannel to channel_axis (5 of 6): restoration #5288](https://github.com/scikit-image/scikit-image/pull/5288)
+- [multichannel to channel_axis (6 of 6): segmentation #5289](https://github.com/scikit-image/scikit-image/pull/5289)
+- [channel_as_last_axis decorator fix #5348](https://github.com/scikit-image/scikit-image/pull/5348)
+- [fix wrong error for metric.structural_similarity when image is too small #5395](https://github.com/scikit-image/scikit-image/pull/5395)
+- [Add a channel_axis argument to functions in the skimage.color module #5462](https://github.com/scikit-image/scikit-image/pull/5462)
+- [Remove deprecated functions and arguments for the 0.19 release #5463](https://github.com/scikit-image/scikit-image/pull/5463)
+- [Support nD images and labels in label2rgb #5550](https://github.com/scikit-image/scikit-image/pull/5550)
+- [remove need for channel_as_last_axis decorator in skimage.filters #5584](https://github.com/scikit-image/scikit-image/pull/5584)
+- [Preserve backwards compatibility for `channel_axis` parameter in transform functions #6095](https://github.com/scikit-image/scikit-image/pull/6095)
+
+#### Update float32 dtype support to match scikit-image 0.19 behavior
+
+Makes float32 and float16 handling consistent with scikit-image 0.19 (most functions support float32; float16 is promoted to float32).
+
+#### Deprecate APIs
+
+Introduces new deprecations as in scikit-image 0.19.
+
+Specifically:
+
+- `selem` -> `footprint`
+- `grey` -> `gray`
+- `iterations` -> `num_iter`
+- `max_iter` -> `max_num_iter`
+- `min_iter` -> `min_num_iter`
+
+### 2. Supporting Multithreading and Batch Processing
+
+cuCIM now supports loading the entire image with multiple threads. It also supports batch loading of images.
+
+If the `device` parameter of the `read_region()` method is `"cuda"`, cuCIM loads the relevant portion of the image file (compressed tile data) into GPU memory using cuFile (GDS, GPUDirect Storage), then decompresses that data using nvJPEG's [Batched Image Decoding API](https://docs.nvidia.com/cuda/nvjpeg/index.html#nvjpeg-batched-image-decoding).
+
+The current implementation of this path is not yet efficient, and its performance is poor compared to the CPU implementation. However, we plan to improve it over the next versions.
+
+#### Example API Usages
+
+The following parameters are added to the `read_region` method:
+
+- `num_workers`: number of workers (threads) to use for loading the image. (default: `1`)
+- `batch_size`: number of images to load at once. (default: `1`)
+- `drop_last`: whether to drop the last batch if the number of images is not divisible by the batch size. (default: `False`)
+- `prefetch_factor`: number of samples loaded in advance by each worker. (default: `2`)
+- `shuffle`: whether to shuffle the input locations. (default: `False`)
+- `seed`: seed value for random value generation. (default: `0`)
+
+**Loading entire image by using multithreads**
+
+```python
+from cucim import CuImage
+
+img = CuImage("input.tif")
+
+region = img.read_region(level=1, num_workers=8)  # read whole image at level 1 using 8 workers
+```
+
+**Loading batched image using multithreads**
+
+You can feed locations of the region through the list/tuple of locations or through the NumPy array of locations.
+(e.g., `((x1, y1), (x2, y2))`).
+Each element in the location should be int type (int64) and the dimension of the location should be
+equal to the dimension of the size.
+You can feed any iterator of locations. The dimensions of the input don't matter: each item in the iterator is flattened once if it is itself an iterator.
+
+For example, you can feed the following iterator:
+
+- `[0, 0, 100, 0]` or `(0, 0, 100, 0)` would be interpreted as a list of `(0, 0)` and `(100, 0)`.
+- `((sx, sy) for sy in range(0, height, patch_size) for sx in range(0, width, patch_size))` would iterate over the locations of the patches.
+- `[(0, 100), (0, 200)]` would be interpreted as a list of `(0, 100)` and `(0, 200)`.
+- A NumPy array such as `np.array(((0, 100), (0, 200)))` or `np.array((0, 100, 0, 200))` is also accepted, and using a NumPy array is faster than using a Python list/tuple.
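
The flattening rule described above can be sketched in plain Python. This is only an illustration of the stated behavior (flatten each item once, then group the values by the number of dimensions); `pair_locations` and its `ndim` parameter are hypothetical names, and this is not cuCIM's actual implementation.

```python
from collections.abc import Iterable


def pair_locations(locations, ndim=2):
    """Hypothetical helper: flatten each item once, then group into ndim-tuples."""
    flat = []
    for item in locations:
        if isinstance(item, Iterable):
            # The item is itself iterable (e.g. an (x, y) pair): flatten it once.
            flat.extend(int(v) for v in item)
        else:
            flat.append(int(item))
    if len(flat) % ndim != 0:
        raise ValueError(f"expected a multiple of {ndim} values, got {len(flat)}")
    return [tuple(flat[i:i + ndim]) for i in range(0, len(flat), ndim)]


print(pair_locations([0, 0, 100, 0]))        # [(0, 0), (100, 0)]
print(pair_locations([(0, 0), (100, 0)]))    # [(0, 0), (100, 0)]

# A generator of patch origins works the same way.
patches = ((sx, sy) for sy in range(0, 200, 100) for sx in range(0, 200, 100))
print(pair_locations(patches))               # [(0, 0), (100, 0), (0, 100), (100, 100)]
```

Flat and nested inputs yield the same list of locations, which is why the shape of the input does not matter to the caller.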
+
+```python
+import numpy as np
+from cucim import CuImage
+
+cache = CuImage.cache("per_process", memory_capacity=1024)
+
+img = CuImage("image.tif")
+
+locations = [[0, 0], [100, 0], [200, 0], [300, 0],
+             [0, 200], [100, 200], [200, 200], [300, 200]]
+# locations = np.array(locations)
+
+region = img.read_region(locations, (224, 224), batch_size=4, num_workers=8)
+
+for batch in region:
+    img = np.asarray(batch)
+    print(img.shape)
+    for item in img:
+        print(item.shape)
+
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+```
+
+**Loading image using nvJPEG and cuFile (GDS, GPUDirect Storage)**
+
+If `"cuda"` is passed as the `device` parameter of the `read_region()` method, cuCIM uses nvJPEG with GPUDirect Storage to load images.
+
+Use CuPy instead of NumPy in this case; the image cache (`CuImage.cache`) is not used.
+
+```python
+import cupy as cp
+from cucim import CuImage
+
+img = CuImage("image.tif")
+
+locations = [[0, 0], [100, 0], [200, 0], [300, 0],
+             [0, 200], [100, 200], [200, 200], [300, 200]]
+# locations = np.array(locations)  # requires `import numpy as np`
+
+region = img.read_region(locations, (224, 224), batch_size=4, device="cuda")
+
+for batch in region:
+    img = cp.asarray(batch)
+    print(img.shape)
+    for item in img:
+        print(item.shape)
+
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+```
+
+#### Experimental Results
+
+We have compared performance against Tifffile for loading the entire image.
+
+##### System Information
+
+- OS: Ubuntu 18.04
+- CPU: [Intel(R) Core(TM) i7-7800X CPU @ 3.50GHz](https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i7-7800X+%40+3.50GHz&id=3037), 12 processors.
+- Memory: 64GB (G-Skill DDR4 2133 16GB X 4)
+- Storage
+  - SATA SSD: [Samsung SSD 850 EVO 1TB](https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-850-evo-2-5-sata-iii-1tb-mz-75e1t0b-am/)
+
+##### Experiment Setup
+
+We benchmarked loading several images against [Tifffile](https://github.com/cgohlke/tifffile).
++ Use the `read_region()` API to read the entire image (.svs/.tiff) at the largest resolution level.
+  - Performed on the following images, which use different compression methods
+    * JPEG2000 YCbCr: [TUPAC-TR-467.svs](https://drive.google.com/drive/u/0/folders/0B--ztKW0d17XYlBqOXppQmw0M2M), 55MB, 19920x26420, tile size 240x240
+    * JPEG: image.tif (256x256 multi-resolution/tiled TIF conversion of TUPAC-TR-467.svs), 238MB, 19920x26420, tile size 256x256
+    * JPEG2000 RGB: [CMU-1-JP2K-33005.svs](https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/), 126MB, 46000x32893, tile size 240x240
+    * JPEG: [0005f7aaab2800f6170c399693a96917.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 46MB, 27648x29440, tile size 512x512
+    * JPEG: [000920ad0b612851f8e01bcc880d9b3d.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 14MB, 15360x13312, tile size 512x512
+    * JPEG: [001d865e65ef5d2579c190a0e0350d8f.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 71MB, 28672x34560, tile size 512x512
+
++ Use the same number of workers (threads) for both cuCIM and Tifffile.
+  - Tifffile uses half of the available processors by default (6 in the test system)
+  - Tested with 6 and 12 threads
++ Use the average time of 5 samples.
++ Test code is available [here](https://gist.github.com/gigony/260d152a83519614ca8c46df551f0d57)
+
+##### Results
+
++ JPEG2000 YCbCr: [TUPAC-TR-467.svs](https://drive.google.com/drive/u/0/folders/0B--ztKW0d17XYlBqOXppQmw0M2M), 55MB, 19920x26420, tile size 240x240
+  - cuCIM [6 threads]: 2.7688472287729384
+  - tifffile [6 threads]: 7.4588409311138095
+  - cuCIM [12 threads]: 2.1468488964252175
+  - tifffile [12 threads]: 6.142562598735094
++ JPEG: image.tif (256x256 multi-resolution/tiled TIF conversion of TUPAC-TR-467.svs), 238MB, 19920x26420, tile size 256x256
+  - cuCIM [6 threads]: 0.6951584462076426
+  - tifffile [6 threads]: 1.0252630705013872
+  - cuCIM [12 threads]: 0.5354489935562015
+  - tifffile [12 threads]: 1.5688881931826473
++ JPEG2000 RGB: [CMU-1-JP2K-33005.svs](https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/), 126MB, 46000x32893, tile size 240x240
+  - cuCIM [6 threads]: 9.2361351958476
+  - tifffile [6 threads]: 27.936951795965435
+  - cuCIM [12 threads]: 7.4136177686043085
+  - tifffile [12 threads]: 22.46532293939963
++ JPEG: [0005f7aaab2800f6170c399693a96917.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 46MB, 27648x29440, tile size 512x512
+  - cuCIM [6 threads]: 0.7972335423342883
+  - tifffile [6 threads]: 0.926042037177831
+  - cuCIM [12 threads]: 0.6366931471042335
+  - tifffile [12 threads]: 0.9512427857145667
++ JPEG: [000920ad0b612851f8e01bcc880d9b3d.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 14MB, 15360x13312, tile size 512x512
+  - cuCIM [6 threads]: 0.2257618647068739
+  - tifffile [6 threads]: 0.25579613661393524
+  - cuCIM [12 threads]: 0.1840262952260673
+  - tifffile [12 threads]: 0.2717844221740961
++ JPEG: [001d865e65ef5d2579c190a0e0350d8f.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 71MB, 28672x34560, tile size 512x512
+  - cuCIM [6 threads]: 0.9925791253335774
+  - tifffile [6 threads]: 1.131185239739716
+  - cuCIM [12 threads]: 0.8037087645381689
+  - tifffile [12 threads]: 1.1474561678245663
+
+### 3. Allow CuPy 10
+
+Relaxes version constraints to allow CuPy 10 (in meta.yaml).
+
+`cupy 9.*` => `cupy >=9,<11.0.0a0`
+
+## 🛠️ Improvements
+
+- Add missing imports tests ([#183](https://github.com/rapidsai/cucim/pull/183)) [@Ethyling](https://github.com/Ethyling)
+- Allow installation with CuPy 10 ([#197](https://github.com/rapidsai/cucim/pull/197)) [@glee77](https://github.com/glee77)
+- Upgrade Numpy to 1.18 for Python 3.9 support ([#196](https://github.com/rapidsai/cucim/pull/196)) [@Ethyling](https://github.com/Ethyling)
+- Upgrade Numpy to 1.19 for Python 3.9 support ([#203](https://github.com/rapidsai/cucim/pull/203)) [@Ethyling](https://github.com/Ethyling)
 
 # cuCIM 21.12.00 (9 Dec 2021)