This is documentation of several experiments and of the various techniques tried.
- Channelwise multiplier works well, but its regularization needs to be an order of magnitude higher than that of the conv layers (see the sketch after this list)
- Multiplier gives slightly better results when combined with Channelwise, but its regularization needs to be an order of magnitude higher than Channelwise's
- SoftOrthogonal works, but the correct parameters still need to be found; it is very slow to train (sketch after this list)
- Erf works well by giving the convolutions an incentive to spread out
- probabilistic_drop_off of gradients does not work
- delta MAE produces better edges but misses flat regions (the run did not complete; a possible definition is sketched after this list)
- ReLU seems to work very well
- ELU (alone) does not produce better results than ReLU, but produces fewer artifacts at very high noise levels
- ReLU6 seems to work better than ReLU and provides better regularization
- squashing the feature space increases MAE
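
A minimal sketch of the channelwise multiplier, assuming a TensorFlow/Keras implementation; the layer name and the l1 values (taken from the configuration below) are illustrative, not the exact code used in these experiments:

```python
# Minimal sketch of a channelwise (per-feature-map) multiplier layer.
# Assumes TensorFlow/Keras; names and values are illustrative.
import tensorflow as tf


class ChannelwiseMultiplier(tf.keras.layers.Layer):
    """Learns one scalar multiplier per feature map.

    With an L1 penalty the multipliers are pushed towards zero, so
    feature maps that do not contribute get switched off entirely.
    """

    def __init__(self, regularizer=None, **kwargs):
        super().__init__(**kwargs)
        self._regularizer = tf.keras.regularizers.get(regularizer)

    def build(self, input_shape):
        channels = int(input_shape[-1])
        # one trainable multiplier per channel, initialized to 1 (identity)
        self.multiplier = self.add_weight(
            name="multiplier",
            shape=(channels,),
            initializer="ones",
            regularizer=self._regularizer,
            trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        # broadcast multiply over batch and spatial dimensions
        return inputs * self.multiplier


# regularization one order of magnitude higher than the conv layers,
# e.g. conv kernels at l1(0.0001), channelwise at l1(0.001)
channelwise = ChannelwiseMultiplier(
    regularizer=tf.keras.regularizers.l1(0.001))
```

Driving the per-channel multipliers to zero is what the "turns off around 16 feature maps completely" note in the configuration below refers to.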
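A hedged sketch of soft orthogonal regularization as a Keras kernel regularizer, penalizing the squared Frobenius norm of (W^T W - I) on the flattened conv kernel; the class name and coefficient are assumptions:

```python
# Hedged sketch of a soft orthogonality kernel regularizer: it penalizes
# the squared Frobenius norm of (W^T W - I) on the flattened conv kernel.
# The class name and coefficient are assumptions, not the exact code used.
import tensorflow as tf


class SoftOrthogonalRegularizer(tf.keras.regularizers.Regularizer):
    def __init__(self, coefficient=0.001):
        self._coefficient = coefficient

    def __call__(self, weights):
        # conv kernel (kh, kw, c_in, c_out) -> matrix (kh*kw*c_in, c_out)
        c_out = int(weights.shape[-1])
        w = tf.reshape(weights, (-1, c_out))
        wt_w = tf.matmul(w, w, transpose_a=True)
        identity = tf.eye(c_out, dtype=wt_w.dtype)
        return self._coefficient * tf.reduce_sum(tf.square(wt_w - identity))

    def get_config(self):
        return {"coefficient": self._coefficient}


# usage: attach to any conv layer; the extra matmul per kernel per step
# is part of why training is noticeably slower
conv = tf.keras.layers.Conv2D(
    filters=64, kernel_size=3, padding="same",
    kernel_regularizer=SoftOrthogonalRegularizer(0.001))
```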
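The delta MAE loss is not defined here; a plausible sketch, assuming it means MAE computed on finite differences (edges) of prediction and target:

```python
# Hedged sketch of a "delta MAE" loss, assuming it means MAE computed on
# finite differences (edges) of prediction and target. This is a guess at
# the definition, not the exact loss used in the experiment.
import tensorflow as tf


def delta_mae(y_true, y_pred):
    # horizontal and vertical finite differences of both images
    dy_true, dx_true = tf.image.image_gradients(y_true)
    dy_pred, dx_pred = tf.image.image_gradients(y_pred)
    # MAE on the differences rewards sharp edges, but flat regions
    # contribute almost nothing, matching the observation above
    return (tf.reduce_mean(tf.abs(dy_true - dy_pred)) +
            tf.reduce_mean(tf.abs(dx_true - dx_pred)))
```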
architecture: resnet (see the sketch after this block)
- depth: 1x6
- filters: 32x64x32
- kernels: 1x3x1
- resolution: 256x256
- extra:
  - channelwise: l1 0.001 (turns off around 16 feature maps completely)
  - erf: l1 0.025
  - relu
  - batchnorm
- parameters: 135k
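
A minimal Keras sketch of the architecture above: one stack of 6 residual blocks (depth 1x6), each with 1x1/3x3/1x1 convolutions and 32/64/32 filters, batchnorm, relu and a channelwise multiplier. It reuses the ChannelwiseMultiplier sketch from earlier in this section; the layer ordering, regularization values and input channel count are assumptions:

```python
# Minimal Keras sketch of the resnet configuration above. Layer ordering,
# regularization values and the input channel count are assumptions.
import tensorflow as tf

L1_CONV = tf.keras.regularizers.l1(0.0001)        # assumed conv penalty
L1_CHANNELWISE = tf.keras.regularizers.l1(0.001)  # an order of magnitude higher


def residual_block(x):
    shortcut = x
    y = tf.keras.layers.Conv2D(32, 1, padding="same",
                               kernel_regularizer=L1_CONV)(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(64, 3, padding="same",
                               kernel_regularizer=L1_CONV)(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(32, 1, padding="same",
                               kernel_regularizer=L1_CONV)(y)
    # per-feature-map multiplier from the earlier sketch, higher l1 penalty
    y = ChannelwiseMultiplier(regularizer=L1_CHANNELWISE)(y)
    return tf.keras.layers.Add()([shortcut, y])


inputs = tf.keras.Input(shape=(256, 256, 3))      # channel count assumed
x = tf.keras.layers.Conv2D(32, 3, padding="same",
                           kernel_regularizer=L1_CONV)(inputs)
for _ in range(6):                                # depth 1x6
    x = residual_block(x)
outputs = tf.keras.layers.Conv2D(3, 1, padding="same")(x)
model = tf.keras.Model(inputs, outputs)           # lands close to the 135k listed
```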
results:
- mae: 3.5
- snr: 7.1 dB
These results indicate that higher channelwise regularization should be tried.