The overall code framework is shown in the following figure. It mainly consists of four parts: `Config`, `Data`, `Model` and `Network`.
Let us take the training command `python train.py -opt options/train/train_esrgan.json` as an example. The following sequence of actions happens after this command:

1. `train.py` is called.
1. It reads the configuration (a json file) in `options/train/train_esrgan.json`, including the configurations for the data loader, network, loss, training strategies, etc. The json file is processed by `options/options.py`.
1. It creates the train and validation data loaders. The data loader is constructed in `data/__init__.py` according to different data modes.
1. It creates the model (constructed in `models/__init__.py` according to different model types). A model mainly consists of two parts: the network structure and the model definition (e.g., loss definition, optimization, etc.). The network is constructed in `models/network.py` and the detailed structures are in `models/modules`.
1. Training starts. Other actions like logging, saving intermediate models, validation and updating the learning rate are also done during training.
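The control flow above can be sketched as a toy script. Every helper here is a stub standing in for the real modules (`options/options.py`, `data/__init__.py`, `models/__init__.py`); the function and class names are illustrative, not the actual API:

```python
# Toy sketch of the train.py control flow; all helpers are stubs.

def parse_options(path):
    # Stands in for options/options.py: read and validate the json config.
    return {"datasets": {"train": {}, "val": {}}, "model": "srgan", "niter": 2}

def create_dataloader(dataset_opt):
    # Stands in for data/__init__.py: pick a Dataset by dataset_opt['mode'].
    return [([0.0], [0.0])]  # one dummy (LR, HR) pair

class DummyModel:
    def optimize_parameters(self, data):
        # In the real model: compute loss, backward, optimizer step.
        self.last = data

def create_model(opt):
    # Stands in for models/__init__.py: pick a model class by opt['model'].
    return DummyModel()

def main():
    opt = parse_options("options/train/train_esrgan.json")
    train_loader = create_dataloader(opt["datasets"]["train"])
    model = create_model(opt)
    steps = 0
    for _ in range(opt["niter"]):
        for data in train_loader:
            model.optimize_parameters(data)
            steps += 1
    return steps
```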
Moreover, there are utilities and useful scripts. A detailed description is provided as follows.
## options/

Configure the options for data loader, network structure, model, training strategies, etc.

- A `json` file is used to configure options and `options/options.py` will convert the json file to a python dict.
- The `json` file uses `null` for `None` and supports `//` comments, i.e., in each line, contents after `//` will be ignored.
- Supports `debug` mode, i.e., a model name starting with `debug_` will trigger the debug mode.
- The configuration files and descriptions can be found in `options`.
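The comment and `null` handling can be illustrated with a small reimplementation; `json_loads_with_comments` below is a hypothetical helper written for this example, not the actual function in `options/options.py`:

```python
import json

def json_loads_with_comments(text):
    # Strip '//' line comments before parsing (illustrative; a naive
    # find() like this would also cut '//' inside strings, e.g. URLs).
    lines = []
    for line in text.splitlines():
        idx = line.find('//')
        if idx != -1:
            line = line[:idx]
        lines.append(line)
    return json.loads('\n'.join(lines))

cfg_text = '''
{
  "name": "debug_esrgan_x4",  // names starting with debug_ trigger debug mode
  "pretrain_model_G": null    // json null maps to Python None
}
'''
cfg = json_loads_with_comments(cfg_text)
```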
## data/

A data loader to provide data for training, validation and testing.

- A separate data loader module. You can modify/create a data loader to meet your own needs.
- Uses the `cv2` package for image processing, which provides rich operations.
- Supports reading files from an image folder or an `lmdb` file. For faster IO during training, it is recommended to create an `lmdb` dataset first. More details including the lmdb format, creation and usage can be found in our lmdb wiki.
- `data/util.py` provides useful tools, for example, the `MATLAB bicubic` operation and rgb<-->ycbcr conversion as in MATLAB. We also provide a MATLAB bicubic imresize wiki and a Color conversion in SR wiki.
- The images are converted to the format: NCHW, [0, 1], RGB, torch float tensor.
## models/

Construct different models for training and testing.

- A model mainly consists of two parts: the network structure and the model definition (e.g., loss definition, optimization, etc.). The network description is in the Network part.
- Based on `base_model.py`, we define different models, e.g., `SR_model.py`, `SRGAN_model.py`, `SRRaGAN_model.py` and `SFTGAN_ACD_model.py`.
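The model selection can be sketched as a simple dispatch on the config's model type, in the spirit of `models/__init__.py`; the classes here are empty stubs and the registry is illustrative, not the actual implementation:

```python
# Stub model classes standing in for the real ones defined in models/.
class SRModel: pass
class SRGANModel: pass

_MODELS = {'sr': SRModel, 'srgan': SRGANModel}

def create_model(opt):
    # Pick the model class according to the 'model' field of the config.
    try:
        cls = _MODELS[opt['model']]
    except KeyError:
        raise NotImplementedError(
            'Model [{}] is not recognized.'.format(opt['model']))
    return cls()
```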
## models/modules/

Construct different network architectures.

- The network is constructed in `models/network.py` and the detailed structures are in `models/modules`.
- We provide some useful blocks in `block.py`, and it is flexible to construct your network structures with these pre-defined blocks.
- You can also easily write your own network architecture in a separate file like `sft_arch.py`.
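The block-composition idea can be sketched without torch: each block is a callable and a `sequential` helper chains them, in the spirit of `block.py` (the block names and the arithmetic here are made up for illustration):

```python
# Torch-free sketch of composing a network from reusable blocks.

def scale_block(scale):
    # Placeholder for a conv-like block: multiply every value.
    return lambda x: [v * scale for v in x]

def relu_block():
    # Placeholder activation block: clamp negatives to zero.
    return lambda x: [max(v, 0.0) for v in x]

def sequential(*blocks):
    # Chain blocks so the output of one feeds the next.
    def net(x):
        for block in blocks:
            x = block(x)
        return x
    return net

net = sequential(scale_block(2.0), relu_block())
```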
## utils/

Provide useful utilities.

- `logger.py` provides logging services during training and testing.
- Supports tensorboard to visualize and compare training loss, validation PSNR, etc. Installation and usage can be found here.
- `progress_bar.py` provides a progress bar which can print the progress.
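A minimal text progress bar in the spirit of `progress_bar.py` can look like this; the rendering format is invented for the example and differs from the real implementation:

```python
import sys

def render_bar(current, total, width=20):
    # Render a fixed-width bar like '[####----] 2/4'.
    done = int(width * current / total)
    bar = '#' * done + '-' * (width - done)
    return '[{}] {}/{}'.format(bar, current, total)

for step in range(1, 4):
    # '\r' rewrites the same terminal line on each update.
    sys.stdout.write('\r' + render_bar(step, 3, width=6))
sys.stdout.write('\n')
```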
## scripts/

Provide useful scripts. Details can be found here.