diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9f232ac79..cf81acf80 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,258 @@
-# cuCIM 22.02.00 (Date TBD)
+# cuCIM 22.02.00 (2 Feb 2022)
 
-Please see https://github.com/rapidsai/cucim/releases/tag/v22.02.00a for the latest changes to this development branch.
+## 🚨 Breaking Changes
+
+- Update cucim.skimage API to match scikit-image 0.19 ([#190](https://github.com/rapidsai/cucim/pull/190)) [@glee77](https://github.com/glee77)
+
+## 🐛 Bug Fixes
+
+- Fix a bug in [v21.12.01](https://github.com/rapidsai/cucim/wiki/release_notes_v21.12.01) ([#191](https://github.com/rapidsai/cucim/pull/191)) [@gigony](https://github.com/gigony)
+  - Fix GPU memory leak when using nvJPEG API (when `device='cuda'` parameter is used in `read_region` method).
+- Fix segfault for preferred_memory_capacity in Python 3.9+ ([#214](https://github.com/rapidsai/cucim/pull/214)) [@gigony](https://github.com/gigony)
+
+## 📖 Documentation
+
+- PyPI v21.12.00 release ([#182](https://github.com/rapidsai/cucim/pull/182)) [@gigony](https://github.com/gigony)
+
+## 🚀 New Features
+
+1. Update cucim.skimage API to match scikit-image 0.19 ([#190](https://github.com/rapidsai/cucim/pull/190)) [@glee77](https://github.com/glee77)
+2. Support multi-threads and batch, and support nvJPEG for JPEG-compressed images ([#191](https://github.com/rapidsai/cucim/pull/191)) [@gigony](https://github.com/gigony)
+3. Allow CuPy 10 ([#195](https://github.com/rapidsai/cucim/pull/195)) [@jakikham](https://github.com/jakikham)
+
+### 1. Update cucim.skimage API to match scikit-image 0.19 (🚨 Breaking Changes)
+
+#### channel_axis support
+
+scikit-image 0.19 adds a `channel_axis` argument that should now be used instead of the `multichannel` boolean.
+
+The `multichannel` argument will likely be removed in scikit-image 1.0, so cuCIM now supports `channel_axis` as well.
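+
+As a rough illustration of how the two arguments relate, `multichannel=True` corresponds to `channel_axis=-1`, while `channel_axis` can also point at any other axis. The sketch below uses only NumPy; the helper name `to_channels_last` is hypothetical and is not part of cuCIM or scikit-image.
+
+```python
+import numpy as np
+
+
+def to_channels_last(image, channel_axis):
+    """Move the channel axis to the end; None means a grayscale image."""
+    if channel_axis is None:
+        return image
+    return np.moveaxis(image, channel_axis, -1)
+
+
+# A channels-first image: 3 channels, 64 rows, 48 columns.
+chw = np.zeros((3, 64, 48))
+hwc = to_channels_last(chw, channel_axis=0)
+print(hwc.shape)  # (64, 48, 3)
+```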
+
+This pulls changes from many scikit-image 0.19.0 PRs related to deprecating `multichannel` in favor of `channel_axis`. A few other minor PRs related to deprecations and updates to `color.label2rgb` are incorporated here as well.
+
+The changes are mostly non-breaking, although a couple of deprecated functions have been removed (`rgb2grey`, `grey2rgb`) and the default value of `label2rgb`'s `bg_label` argument has changed. The deprecated `alpha` argument was removed from `gray2rgb`.
+
+Implements:
+
+- [Add saturation parameter to color.label2rgb #5156](https://github.com/scikit-image/scikit-image/pull/5156)
+- [Decorators for helping with the multichannel->channel_axis transition #5228](https://github.com/scikit-image/scikit-image/pull/5228)
+- [multichannel to channel_axis (1 of 6): features and draw #5284](https://github.com/scikit-image/scikit-image/pull/5284)
+- [multichannel to channel_axis (2 of 6): transform functions #5285](https://github.com/scikit-image/scikit-image/pull/5285)
+- [multichannel to channel_axis (3 of 6): filters #5286](https://github.com/scikit-image/scikit-image/pull/5286)
+- [multichannel to channel_axis (4 of 6): metrics and measure #5287](https://github.com/scikit-image/scikit-image/pull/5287)
+- [multichannel to channel_axis (5 of 6): restoration #5288](https://github.com/scikit-image/scikit-image/pull/5288)
+- [multichannel to channel_axis (6 of 6): segmentation #5289](https://github.com/scikit-image/scikit-image/pull/5289)
+- [channel_as_last_axis decorator fix #5348](https://github.com/scikit-image/scikit-image/pull/5348)
+- [fix wrong error for metric.structural_similarity when image is too small #5395](https://github.com/scikit-image/scikit-image/pull/5395)
+- [Add a channel_axis argument to functions in the skimage.color module #5462](https://github.com/scikit-image/scikit-image/pull/5462)
+- [Remove deprecated functions and arguments for the 0.19 release #5463](https://github.com/scikit-image/scikit-image/pull/5463)
+- [Support nD images and labels in label2rgb #5550](https://github.com/scikit-image/scikit-image/pull/5550)
+- [remove need for channel_as_last_axis decorator in skimage.filters #5584](https://github.com/scikit-image/scikit-image/pull/5584)
+- [Preserve backwards compatibility for `channel_axis` parameter in transform functions #6095](https://github.com/scikit-image/scikit-image/pull/6095)
+
+#### Update float32 dtype support to match scikit-image 0.19 behavior
+
+Makes float32 and float16 handling consistent with scikit-image 0.19 (most functions support float32; float16 is promoted to float32).
+
+#### Deprecate APIs
+
+Introduces new deprecations as in scikit-image 0.19.
+
+Specifically:
+
+- `selem` -> `footprint`
+- `grey` -> `gray`
+- `iterations` -> `num_iter`
+- `max_iter` -> `max_num_iter`
+- `min_iter` -> `min_num_iter`
+
+### 2. Supporting Multithreading and Batch Processing
+
+cuCIM now supports loading an entire image using multiple threads. It also supports batch loading of images.
+
+If the `device` parameter of the `read_region()` method is `"cuda"`, cuCIM loads the relevant portion of the image file (compressed tile data) into GPU memory using cuFile (GDS, GPUDirect Storage), then decompresses that data using nvJPEG's [Batched Image Decoding API](https://docs.nvidia.com/cuda/nvjpeg/index.html#nvjpeg-batched-image-decoding).
+
+The current GPU implementation is not efficient, and its performance is poor compared to the CPU implementation. We plan to improve it in upcoming versions.
+
+#### Example API Usages
+
+The following parameters have been added to the `read_region` method:
+
+- `num_workers`: number of workers (threads) to use for loading the image. (default: `1`)
+- `batch_size`: number of images to load at once. (default: `1`)
+- `drop_last`: whether to drop the last batch if the number of images is not divisible by the batch size. (default: `False`)
+- `prefetch_factor`: number of samples loaded in advance by each worker. (default: `2`)
+- `shuffle`: whether to shuffle the input locations. (default: `False`)
+- `seed`: seed value for random value generation. (default: `0`)
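+
+To make the interaction between `batch_size` and `drop_last` concrete, here is a minimal sketch of how many batches a given number of locations would yield. It assumes the two options behave as described in the list above; `num_batches` is a hypothetical helper and is not part of the cuCIM API.
+
+```python
+import math
+
+
+def num_batches(num_locations, batch_size, drop_last=False):
+    """Count batches; an incomplete final batch is kept unless drop_last is set."""
+    if drop_last:
+        return num_locations // batch_size
+    return math.ceil(num_locations / batch_size)
+
+
+print(num_batches(10, 4))                  # 3 (the last batch holds 2 items)
+print(num_batches(10, 4, drop_last=True))  # 2
+print(num_batches(8, 4, drop_last=True))   # 2 (nothing to drop)
+```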
+
+**Loading an entire image using multiple threads**
+
+```python
+from cucim import CuImage
+
+img = CuImage("input.tif")
+
+region = img.read_region(level=1, num_workers=8)  # read whole image at level 1 using 8 workers
+```
+
+**Loading batched images using multiple threads**
+
+You can pass region locations as a list/tuple of locations or as a NumPy array of locations
+(e.g., `((<x for loc 1>, <y for loc 1>), (<x for loc 2>, <y for loc 2>))`).
+Each element of a location should be an integer (int64), and the dimension of each location should be
+equal to the dimension of the size.
+You can also pass any iterator of locations. The nesting of the input doesn't matter: if an item in the iterator is itself an iterable, it is flattened once.
+
+For example, you can feed the following iterator:
+
+- `[0, 0, 100, 0]` or `(0, 0, 100, 0)` would be interpreted as a list of `(0, 0)` and `(100, 0)`.
+- `((sx, sy) for sy in range(0, height, patch_size) for sx in range(0, width, patch_size))` would iterate over the locations of the patches.
+- `[(0, 100), (0, 200)]` would be interpreted as a list of `(0, 100)` and `(0, 200)`.
+- A NumPy array such as `np.array(((0, 100), (0, 200)))` or `np.array((0, 100, 0, 200))` is also accepted, and using a NumPy array is faster than using a Python list/tuple.
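+
+As a sketch of the NumPy variant, the generator expression in the list above can be materialized into an int64 array of `(x, y)` locations. The image dimensions and patch size here are hypothetical values chosen only for illustration, and no cuCIM call is made.
+
+```python
+import numpy as np
+
+width, height, patch_size = 1000, 600, 224  # hypothetical image and patch size
+
+# Same order as the generator expression: x varies fastest within each row.
+locations = np.array(
+    [(sx, sy)
+     for sy in range(0, height, patch_size)
+     for sx in range(0, width, patch_size)],
+    dtype=np.int64,
+)
+print(locations.shape)  # (15, 2): 5 columns x 3 rows of patches
+
+# The flattened form carries the same information as the nested one.
+flat = locations.reshape(-1)
+print(flat[:4])  # [  0   0 224   0]
+```
+
+Either `locations` or `flat` could then be passed as the first argument of `read_region`, as in the example that follows.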
+
+```python
+import numpy as np
+from cucim import CuImage
+
+cache = CuImage.cache("per_process", memory_capacity=1024)
+
+img = CuImage("image.tif")
+
+locations = [[0,   0], [100,   0], [200,   0], [300,   0],
+             [0, 200], [100, 200], [200, 200], [300, 200]]
+# locations = np.array(locations)
+
+region = img.read_region(locations, (224, 224), batch_size=4, num_workers=8)
+
+for batch in region:
+    img = np.asarray(batch)
+    print(img.shape)
+    for item in img:
+        print(item.shape)
+
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+```
+
+**Loading image using nvJPEG and cuFile (GDS, GPUDirect Storage)**
+
+If `"cuda"` is passed as the `device` parameter of the `read_region()` method, cuCIM uses nvJPEG with GPUDirect Storage to load images.
+
+In this case, use CuPy instead of NumPy to access the result. The image cache (`CuImage.cache`) is not used.
+
+```python
+import cupy as cp
+from cucim import CuImage
+
+img = CuImage("image.tif")
+
+locations = [[0,   0], [100,   0], [200,   0], [300,   0],
+             [0, 200], [100, 200], [200, 200], [300, 200]]
+# locations = np.array(locations)
+
+region = img.read_region(locations, (224, 224), batch_size=4, device="cuda")
+
+for batch in region:
+    img = cp.asarray(batch)
+    print(img.shape)
+    for item in img:
+        print(item.shape)
+
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (4, 224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+# (224, 224, 3)
+```
+
+#### Experimental Results
+
+We have compared performance against Tifffile for loading the entire image.
+
+##### System Information
+
+- OS: Ubuntu 18.04
+- CPU: [Intel(R) Core(TM) i7-7800X CPU @ 3.50GHz](https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i7-7800X+%40+3.50GHz&id=3037), 12 processors.
+- Memory: 64GB (G-Skill DDR4 2133 16GB X 4)
+- Storage
+  - SATA SSD: [Samsung SSD 850 EVO 1TB](https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-850-evo-2-5-sata-iii-1tb-mz-75e1t0b-am/)
+  
+##### Experiment Setup
+
+Benchmarked loading several images with [Tifffile](https://github.com/cgohlke/tifffile).
++ Use read_region() APIs to read the entire image (.svs/.tiff) at the largest resolution level.
+    - Performed on the following images that use a different compression method
+        * JPEG2000 YCbCr: [TUPAC-TR-467.svs](https://drive.google.com/drive/u/0/folders/0B--ztKW0d17XYlBqOXppQmw0M2M), 55MB, 19920x26420, tile size 240x240
+        * JPEG: image.tif (256x256 multi-resolution/tiled TIF conversion of TUPAC-TR-467.svs), 238MB, 19920x26420, tile size 256x256
+        * JPEG2000 RGB: [CMU-1-JP2K-33005.svs](https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/), 126MB, 46000x32893, tile size 240x240
+        * JPEG: [0005f7aaab2800f6170c399693a96917.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 46MB, 27648x29440, tile size 512x512
+        * JPEG: [000920ad0b612851f8e01bcc880d9b3d.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 14MB, 15360x13312, tile size 512x512
+        * JPEG: [001d865e65ef5d2579c190a0e0350d8f.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data) in [Prostate cANcer graDe Assessment (PANDA) Challenge](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 71MB, 28672x34560, tile size 512x512
+
++ Use the same number of workers (threads) for both cuCIM and Tifffile.
+    - Tifffile uses half of the available processors by default (6 in the test system)
+    - Tested with 6 and 12 threads
++ Use the average time of 5 samples.
++ Test code is available [here](https://gist.github.com/gigony/260d152a83519614ca8c46df551f0d57)
+
+##### Results
+
++ JPEG2000 YCbCr: [TUPAC-TR-467.svs](https://drive.google.com/drive/u/0/folders/0B--ztKW0d17XYlBqOXppQmw0M2M), 55MB, 19920x26420, tile size 240x240
+  - cuCIM [6 threads]: 2.7688472287729384
+  - tifffile [6 threads]: 7.4588409311138095
+  - cuCIM [12 threads]: 2.1468488964252175
+  - tifffile [12 threads]: 6.142562598735094
++ JPEG: image.tif (256x256 multi-resolution/tiled TIF conversion of TUPAC-TR-467.svs), 238MB, 19920x26420, tile size 256x256
+  - cuCIM [6 threads]: 0.6951584462076426
+  - tifffile [6 threads]: 1.0252630705013872
+  - cuCIM [12 threads]: 0.5354489935562015
+  - tifffile [12 threads]: 1.5688881931826473
++ JPEG2000 RGB: [CMU-1-JP2K-33005.svs](https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/), 126MB, 46000x32893, tile size 240x240
+  - cuCIM [6 threads]: 9.2361351958476
+  - tifffile [6 threads]: 27.936951795965435
+  - cuCIM [12 threads]: 7.4136177686043085
+  - tifffile [12 threads]: 22.46532293939963
++ JPEG: [0005f7aaab2800f6170c399693a96917.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 46MB, 27648x29440, tile size 512x512
+  - cuCIM [6 threads]: 0.7972335423342883
+  - tifffile [6 threads]: 0.926042037177831
+  - cuCIM [12 threads]: 0.6366931471042335
+  - tifffile [12 threads]: 0.9512427857145667
++ JPEG: [000920ad0b612851f8e01bcc880d9b3d.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 14MB, 15360x13312, tile size 512x512
+  - cuCIM [6 threads]: 0.2257618647068739
+  - tifffile [6 threads]: 0.25579613661393524
+  - cuCIM [12 threads]: 0.1840262952260673
+  - tifffile [12 threads]: 0.2717844221740961
++ JPEG: [001d865e65ef5d2579c190a0e0350d8f.tiff](https://www.kaggle.com/c/prostate-cancer-grade-assessment/data), 71MB, 28672x34560, tile size 512x512
+  - cuCIM [6 threads]: 0.9925791253335774
+  - tifffile [6 threads]: 1.131185239739716
+  - cuCIM [12 threads]: 0.8037087645381689
+  - tifffile [12 threads]: 1.1474561678245663
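+
+The measurements above can be summarized as speedup ratios (tifffile time divided by cuCIM time). The sketch below assumes the reported values are elapsed times in the same unit (the list does not state the unit) and rounds them to four decimal places for brevity.
+
+```python
+# (cuCIM, tifffile) timings per image and thread count, rounded from the list above.
+timings = {
+    "TUPAC-TR-467.svs": {6: (2.7688, 7.4588), 12: (2.1468, 6.1426)},
+    "image.tif": {6: (0.6952, 1.0253), 12: (0.5354, 1.5689)},
+    "CMU-1-JP2K-33005.svs": {6: (9.2361, 27.9370), 12: (7.4136, 22.4653)},
+    "0005f7aaab2800f6170c399693a96917.tiff": {6: (0.7972, 0.9260), 12: (0.6367, 0.9512)},
+    "000920ad0b612851f8e01bcc880d9b3d.tiff": {6: (0.2258, 0.2558), 12: (0.1840, 0.2718)},
+    "001d865e65ef5d2579c190a0e0350d8f.tiff": {6: (0.9926, 1.1312), 12: (0.8037, 1.1475)},
+}
+
+# Speedup > 1 means cuCIM finished faster than tifffile.
+speedups = {
+    (name, threads): tifffile_time / cucim_time
+    for name, by_threads in timings.items()
+    for threads, (cucim_time, tifffile_time) in by_threads.items()
+}
+
+for (name, threads), ratio in speedups.items():
+    print(f"{name} [{threads} threads]: {ratio:.2f}x")
+```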
+
+### 3. Allow CuPy 10
+
+Relaxes version constraints to allow CuPy 10 (in meta.yaml).
+
+`cupy 9.*` => `cupy >=9,<11.0.0a0`
+
+## 🛠️ Improvements
+
+- Add missing imports tests ([#183](https://github.com/rapidsai/cucim/pull/183)) [@Ethyling](https://github.com/Ethyling)
+- Allow installation with CuPy 10 ([#197](https://github.com/rapidsai/cucim/pull/197)) [@glee77](https://github.com/glee77)
+- Upgrade Numpy to 1.18 for Python 3.9 support ([#196](https://github.com/rapidsai/cucim/pull/196)) [@Ethyling](https://github.com/Ethyling)
+- Upgrade Numpy to 1.19 for Python 3.9 support ([#203](https://github.com/rapidsai/cucim/pull/203)) [@Ethyling](https://github.com/Ethyling)
 
 # cuCIM 21.12.00 (9 Dec 2021)