A library for blind image denoising algorithms using bias-free denoising CNNs.
The idea is that denoising is a task orthogonal to most medium/high-level computer vision tasks and should always be performed beforehand by a fast, independently trained, bias-free network. This would enable any medium/high-level vision network to focus on its main task.
My target is to create a series of:
- multi scale
- interpretable
- high performance
- low memory footprint
- fixed budget (feed-forward convolutional neural network)
models that perform denoising on an input (grayscale or color) image.
The bias-free nature of the model allows for easy interpretation and use as a prior for hard inverse problems.
Interpretation comes naturally by implementing the ICLR 2020 paper:
"ROBUST AND INTERPRETABLE BLIND IMAGE DENOISING VIA BIAS-FREE CONVOLUTIONAL NEURAL NETWORKS"
This paper provides excellent results.
The bias-free nature of the model means that it is completely interpretable: each output pixel is a weighted average of input pixels, i.e. a weighting mask per pixel, as shown below.
In order to train such a model, we corrupt an input image with several types of noise and then try to recover the original image (a rough sketch follows the list):
- subsampling noise
- normally distributed additive noise (same across channels / different per channel)
- normally distributed multiplicative noise (same across channels / different per channel)
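A rough sketch of how such corruption might look (not the library's exact pipeline; the function and its parameters are illustrative assumptions):

import tensorflow as tf

def corrupt(images, additive_std=10.0, multiplicative_std=0.05, subsample=2):
    # images: float32 batch in [0, 255], shape [B, H, W, C]
    # additive, normally distributed noise (per-channel variants are also possible)
    noisy = images + tf.random.normal(tf.shape(images), stddev=additive_std)
    # multiplicative, normally distributed noise centered at 1
    noisy = noisy * tf.random.normal(tf.shape(images), mean=1.0, stddev=multiplicative_std)
    # subsampling noise: downscale and upscale again, losing high frequencies
    size = tf.shape(images)[1:3]
    noisy = tf.image.resize(tf.image.resize(noisy, size // subsample), size)
    return tf.clip_by_value(noisy, 0.0, 255.0)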
Currently, we have 3 pretrained models:
- resnet_color_1x6_bn_16x3x3_256x256_l1_relu
- resnet_color_1x12_bn_16x3x3_256x256_l1_relu
- resnet_color_1x18_bn_16x3x3_256x256_l1_relu
They are all resnet variants with depths 6, 12 and 18.
They were all trained for 20 epochs on the KITTI, MegaDepth, BDD, WIDER and WFLW datasets.
The following samples are 256x256 crops from the KITTI dataset, denoised using the resnet_color_1x18_bn_16x3x3_256x256_l1_relu model.
We add truncated normal noise with different standard deviations and calculate the Mean Absolute Error (MAE) both for the noisy and the denoised images.
The pixel range is 0-255.
We can clearly see that the model adapts well to different ranges of noise.
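A minimal sketch of this evaluation (not the exact benchmark script; model and clean are placeholders for a loaded denoiser and a clean uint8 batch):

import tensorflow as tf

def mae(a, b):
    # mean absolute error over all pixels, 0-255 range
    return tf.reduce_mean(tf.abs(tf.cast(a, tf.float32) - tf.cast(b, tf.float32)))

def evaluate(model, clean, stddev):
    # corrupt with truncated normal noise of the given standard deviation
    noise = tf.random.truncated_normal(tf.shape(clean), stddev=stddev)
    noisy = tf.clip_by_value(tf.cast(clean, tf.float32) + noise, 0.0, 255.0)
    denoised = model(tf.cast(noisy, tf.uint8))
    # report MAE of the noisy input and of the denoised output vs the clean image
    return mae(noisy, clean), mae(denoised, clean)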
- prepare training input
- prepare training configuration
- run training
- export to tflite and saved_model format
- use models
Prepare a training configuration and train with the following command:
python -m bfcnn.train \
--model-directory ${TRAINING_DIR} \
--pipeline-config ${PIPELINE}
Export to frozen graph and/or tflite with the following command:
python -m bfcnn.export \
--checkpoint-directory ${TRAINING_DIR} \
--pipeline-config ${PIPELINE} \
--output-directory ${OUTPUT_DIR} \
--to-tflite
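After exporting with --to-tflite, the resulting model can be loaded with the standard TensorFlow Lite interpreter; the file path below is an assumption about where the export lands:

import numpy as np
import tensorflow as tf

# path is hypothetical; point it at the .tflite file produced by the export step
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# feed a dummy image with the dtype/shape the exported model expects
image = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
denoised = interpreter.get_tensor(output_details[0]["index"])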
Use any of the pretrained models included in the package.
- resnet_color_1x6_bn_16x3x3_256x256_l1_relu
- resnet_color_1x12_bn_16x3x3_256x256_l1_relu
- resnet_color_1x18_bn_16x3x3_256x256_l1_relu
import bfcnn
import tensorflow as tf

# load model
denoiser_model = \
    bfcnn.load_model(
        "resnet_color_1x6_bn_16x3x3_256x256_l1_relu")

# create random tensor
input_tensor = \
    tf.random.uniform(
        shape=[1, 256, 256, 3],
        minval=0,
        maxval=255,
        dtype=tf.int32)
input_tensor = \
    tf.cast(
        input_tensor,
        dtype=tf.uint8)

# run inference
denoised_tensor = denoiser_model(input_tensor)
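The output can then be written to disk; a short sketch assuming the model returns a batch of the same shape in a uint8-compatible range:

# drop the batch dimension and save as PNG
denoised_image = tf.squeeze(denoised_tensor, axis=0)
encoded = tf.io.encode_png(tf.cast(denoised_image, tf.uint8))
tf.io.write_file("denoised.png", encoded)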
- Add a small hinge to the MAE loss; a hinge of 2 (out of 255) seems to work very well (see the sketch after this list).
- Multiscale models work better; 3-4 scales are ideal.
- Soft-Orthogonal regularization provides better generalization, but it's slower to train.
- Effective Receptive Field regularization provides better generalization, but it's slower to train.
- Squeeze-and-Excite provides a small boost without many additional parameters.
- Avoid Batch Normalization at the end.
- Residual learning (learning the noise) trains faster and gives better metrics but may produce artifacts, so it is better avoided.
- Every sample in each batch uses independent forms of noise.
- A selectors block boosts convergence speed and accuracy.
All these options are supported in the configuration.
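As an illustration of the first tip, a hinged MAE loss can be sketched like this (a sketch, not the library's exact loss; tensors are assumed to be floats in the 0-255 range):

import tensorflow as tf

def hinged_mae(y_true, y_pred, hinge=2.0):
    # absolute error per pixel
    error = tf.abs(y_true - y_pred)
    # errors below the hinge (2 out of 255) contribute nothing to the loss
    return tf.reduce_mean(tf.nn.relu(error - hinge))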
We have used traditional (bias-free) architectures:
- resnet
- resnet with sparse constraint
- resnet with on/off per resnet block gates
- all the above models with multi-scale processing
The system is trained at multiple scales by implementing ideas from LapSRN (Laplacian Pyramid Super-Resolution Network) and MS-LapSRN (Multi-Scale Laplacian Pyramid Super-Resolution Network).
By using a Gaussian pyramid and a bias-free CNN model shared across scales, we can keep the model small enough to run on very small devices while ensuring a big enough ERF (effective receptive field) for the task at hand.
Our addition (not in the paper) is the Laplacian multi-scale pyramid, which expands the effective receptive field without the need to add many more layers (keeping it computationally cheap). It breaks down the original image into 3 different scales and processes them independently:
We also have the option to add residuals at the end of each processing level, so it works like an iterative process:
Our addition (not in the paper) is the Gaussian multi-scale pyramid, which expands the effective receptive field without the need to add many more layers (keeping it computationally cheap).
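As a rough illustration of the multi-scale idea (a sketch, not the library's implementation; model is assumed to be a shared denoiser that accepts float tensors of any spatial size):

import tensorflow as tf

def denoise_multiscale(model, image, levels=3):
    # build a Gaussian pyramid by repeated 2x downsampling
    gaussian = [image]
    for _ in range(levels - 1):
        size = tf.shape(gaussian[-1])[1:3] // 2
        gaussian.append(tf.image.resize(gaussian[-1], size))
    # Laplacian levels keep the details lost at each downsampling step
    laplacian = []
    for lo, hi in zip(gaussian[1:], gaussian[:-1]):
        laplacian.append(hi - tf.image.resize(lo, tf.shape(hi)[1:3]))
    # the same (shared) bias-free model is applied at every scale,
    # then the result is rebuilt by upsampling and adding the details back
    result = model(gaussian[-1])
    for detail in reversed(laplacian):
        result = tf.image.resize(result, tf.shape(detail)[1:3])
        result = result + model(detail)
    return result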
Every resnet block can optionally include a residual squeeze-and-excite element (not in the paper).
Our addition (not in the paper) is a (non-channel-wise and non-learnable) normalization layer (not BatchNorm) after the depthwise operations. This enforces sparsity together with the differentiable relu below.
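A minimal Keras sketch of such a residual squeeze-and-excite element (the bottleneck ratio and the bias-free choice here are illustrative assumptions):

import tensorflow as tf
from tensorflow.keras import layers

def residual_squeeze_excite(x, ratio=4):
    # squeeze: global average pooling to one value per channel
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D(keepdims=True)(x)
    # excite: a small bottleneck producing per-channel weights in [0, 1]
    # (use_bias=False keeps the block bias-free)
    s = layers.Conv2D(channels // ratio, 1, activation="relu", use_bias=False)(s)
    s = layers.Conv2D(channels, 1, activation="sigmoid", use_bias=False)(s)
    # residual form: the rescaled features are added back to the input
    return x + x * s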
Our addition (not in the paper) is a differentiable relu for specific operations.
We added an optional orthogonality regularization constraint, as found in the paper "Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?". This enforces a soft orthonormal constraint on the kernels.
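That penalty is essentially the Frobenius norm of (WᵀW - I); it can be sketched as a custom Keras regularizer (a sketch, not the library's implementation):

import tensorflow as tf

class SoftOrthogonalRegularizer(tf.keras.regularizers.Regularizer):
    # sketch of the soft orthogonality penalty ||W^T W - I||_F^2
    def __init__(self, strength=1e-4):
        self.strength = strength

    def __call__(self, weights):
        # flatten a conv kernel [kh, kw, c_in, c_out] into a 2-D matrix
        w = tf.reshape(weights, [-1, weights.shape[-1]])
        gram = tf.matmul(w, w, transpose_a=True)
        identity = tf.eye(weights.shape[-1])
        return self.strength * tf.reduce_sum(tf.square(gram - identity))

    def get_config(self):
        return {"strength": self.strength}

Such a regularizer would be passed as the kernel_regularizer of a convolutional layer.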
Custom regularization that forces a soft orthogonal constraint on the kernels while still allowing the kernels to grow independently or shrink to almost zero.
Custom regularization that incentivizes convolutional kernels to have higher weights away from the center.
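One possible form of such a penalty (an assumption; the exact formula used in the library is not specified here) is to penalize kernel energy near the center, which pushes weight toward the periphery:

import tensorflow as tf

def erf_regularizer(weights, strength=1e-4):
    # weights: conv kernel of shape [kh, kw, c_in, c_out]
    kh, kw = weights.shape[0], weights.shape[1]
    # distance of each tap from the kernel center
    ys = tf.abs(tf.range(kh, dtype=tf.float32) - (kh - 1) / 2.0)
    xs = tf.abs(tf.range(kw, dtype=tf.float32) - (kw - 1) / 2.0)
    distance = ys[:, None] + xs[None, :]
    # closeness is 1 at the center and 0 at the farthest taps
    closeness = 1.0 - distance / tf.maximum(tf.reduce_max(distance), 1.0)
    # penalize squared weights near the center
    return strength * tf.reduce_sum(tf.square(weights) * closeness[:, :, None, None])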
- Robust and interpretable blind image denoising via bias-free convolutional neural networks
- Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
- Densely Residual Laplacian Super-Resolution
- Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
- Squeeze-and-Excitation Networks
tensorflow_graphics requires that you install the following packages:
- libopenexr-dev
- python3-dev
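On Debian/Ubuntu systems these can typically be installed with sudo apt-get install libopenexr-dev python3-dev (assuming apt is your package manager).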
I would like to thank Pantelis Georgiades and Alexandros Georgiou from the Cyprus Institute for running invaluable hyperparameter searches for me on their supercomputer. Their help accelerated this project enormously.