In this section, we present the different options to configure the framework. To use this library, six parameters must be set in a configuration file: the dataset of images, the kind of problem, the input, the output, the generation mode, and the techniques to be applied. Since generating a configuration file might be cumbersome for some users, we have created a tool to simplify this task.
The dataset of images is given by the path where the images are located.
The kind of problem is either classification, localization, detection, segmentation, instance_segmentation, stackclassification, stackdetection, or stacksegmentation.
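As an illustration of the parameters described above, the following minimal sketch checks a configuration before it is used. The key names (`dataset_path`, `problem`, `input`, `output`, `generation_mode`, `techniques`) and the `validate_config` helper are hypothetical and chosen only for this example; the library's actual configuration schema may differ. Only the list of problem kinds is taken from the text.

```python
# Hypothetical key names: the library's real configuration schema may differ.
REQUIRED_KEYS = (
    "dataset_path", "problem", "input", "output", "generation_mode", "techniques",
)

# Kinds of problem listed in this section.
PROBLEMS = {
    "classification", "localization", "detection", "segmentation",
    "instance_segmentation", "stackclassification", "stackdetection",
    "stacksegmentation",
}


def validate_config(config):
    """Return a list of issues found in `config` (empty when none are found)."""
    errors = [f"missing parameter: {key}" for key in REQUIRED_KEYS if key not in config]
    problem = config.get("problem")
    if problem is not None and problem not in PROBLEMS:
        errors.append(f"unknown kind of problem: {problem}")
    return errors


# Illustrative values only.
example = {
    "dataset_path": "path/to/images",
    "problem": "classification",
    "input": "folders",
    "output": "folders",
    "generation_mode": "linear",
    "techniques": ["Flip", "Rotate"],
}
```

Calling `validate_config(example)` returns an empty list, while a configuration with a misspelled problem kind or a missing parameter yields one message per issue.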
Before explaining the input, output and generation modes, it is important to understand how the images are annotated in each of the four problems. In the case of object classification, each image is labeled with a predefined category; for object localization, a bounding box indicating the position of the object in the image is provided; for object detection, a list of bounding boxes and the category of the objects inside those boxes are given; finally, in semantic segmentation, each pixel of the image is labeled with the class of its enclosing object.
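The four kinds of annotation can be pictured as the following data structures. This is only a sketch to make the differences concrete: the class and field names are assumptions made for this example and do not correspond to types defined by the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical representation)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int


@dataclass
class ClassificationAnnotation:
    category: str  # one category for the whole image


@dataclass
class LocalizationAnnotation:
    box: BoundingBox  # position of the single object in the image


@dataclass
class DetectedObject:
    box: BoundingBox
    category: str


@dataclass
class DetectionAnnotation:
    objects: List[DetectedObject]  # several boxes, each with its category


@dataclass
class SegmentationAnnotation:
    mask: List[List[int]]  # one class id per pixel, same size as the image
```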
There are several options to augment a dataset for an object classification problem.
In this mode, the input dataset of images is organized by folders, and the label of an image is given by the name of the containing folder. The output produced is the dataset of augmented images organized with the same structure as the input folder. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
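The linear generation mode can be sketched as two nested loops. In this sketch images are stood in for by strings and techniques by functions that append their name, so the result shows which technique produced which image. The helper names `tag` and `linear_generation` are hypothetical, and the sketch assumes that only the newly generated images are returned.

```python
def tag(name):
    """Return a stand-in technique that records its name on the image label."""
    def technique(image):
        return f"{image}+{name}"
    return technique


def linear_generation(images, techniques):
    """Apply each technique independently to every image of the dataset."""
    augmented = []
    for technique in techniques:
        for image in images:
            augmented.append(technique(image))
    return augmented


result = linear_generation(["img1", "img2"], [tag("flip"), tag("rotate")])
# result == ["img1+flip", "img2+flip", "img1+rotate", "img2+rotate"]
```

With n images and m techniques, this sketch produces n × m augmented images.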
In this mode, the input dataset of images is organized by folders, and the label of an image is given by the name of the containing folder. The output produced is the dataset of augmented images stored in an hdf5 file. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is organized by folders, and the label of an image is given by the name of the containing folder. The output produced is the dataset of augmented images stored in an hdf5 file. The generation mode in this case is power. The power mode is a pipeline approach where augmentation techniques are chained together. In this approach, the images produced in one step of the pipeline are added to the dataset that will be fed to the next step of the pipeline.
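The power mode described above can be sketched as follows. Images are stood in for by strings and techniques by functions that append their name; `tag` and `power_generation` are hypothetical helpers. The sketch assumes that every step applies its technique to all the images accumulated so far and that the original images are kept in the result.

```python
def tag(name):
    """Return a stand-in technique that records its name on the image label."""
    def technique(image):
        return f"{image}+{name}"
    return technique


def power_generation(images, techniques):
    """Chain the techniques: each step's output joins the dataset fed to the following step."""
    dataset = list(images)
    for technique in techniques:
        produced = [technique(image) for image in dataset]
        dataset.extend(produced)
    return dataset


result = power_generation(["img1"], [tag("flip"), tag("rotate")])
# result == ["img1", "img1+flip", "img1+rotate", "img1+flip+rotate"]
```

Under these assumptions the dataset doubles at every step, so n images and m techniques give n × 2^m images, which is why this mode grows much faster than the linear one.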
In this mode, the input dataset of images is organized by folders, and the label of an image is given by the name of the containing folder. The output produced is a batch of images that can be fed to Keras. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
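Several of the modes above read a dataset organized by folders, where the name of the containing folder is the label. A minimal sketch of how such a dataset could be listed is shown below; the function name `list_labeled_images` and the set of accepted extensions are assumptions made for this example, not part of the library.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files in the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def list_labeled_images(dataset_path):
    """Return (image_path, label) pairs, taking the label from the parent folder name."""
    root = Path(dataset_path)
    pairs = []
    for path in sorted(root.glob("*/*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            pairs.append((path, path.parent.name))
    return pairs
```

For a dataset with the folders `cats` and `dogs`, every image inside `cats` is paired with the label `cats`, and likewise for `dogs`; files with other extensions are ignored.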
In this mode, the input dataset of images is given by an image and its annotation using the PascalVOC format. The output produced is the dataset of augmented images together with its annotation using the PascalVOC format. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is given by an image and its annotation using the PascalVOC format. The output produced is the dataset of augmented images stored in an hdf5 file. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is given by an image and its annotation using the PascalVOC format. The output produced is a batch of images that can be fed to Keras. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is given by an image and its annotation using the PascalVOC format. The output produced is the dataset of augmented images together with its annotation using the PascalVOC format. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is given by an image and its annotation using the PascalVOC format. The output produced is a batch of images that can be fed to Keras. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is given by an image and its annotation using the YOLO format. The output produced is the dataset of augmented images together with its annotation using the YOLO format. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is organized by folders, one folder containing the images and another folder containing the annotation images (the names must match). The output produced is the dataset of augmented images organized with the same structure as the input folder. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
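The requirement that image and annotation names match can be checked before augmenting. The sketch below pairs each image with the annotation image of the same file name; `pair_images_with_masks` is a hypothetical helper, and it assumes that "matching" means identical file names including the extension.

```python
from pathlib import Path


def pair_images_with_masks(images_dir, masks_dir):
    """Pair every image with the annotation image that has the same file name."""
    masks = {path.name: path for path in Path(masks_dir).iterdir() if path.is_file()}
    pairs, missing = [], []
    for image in sorted(Path(images_dir).iterdir()):
        if not image.is_file():
            continue
        mask = masks.get(image.name)
        if mask is None:
            missing.append(image.name)
        else:
            pairs.append((image, mask))
    if missing:
        # Fail early instead of silently augmenting images without annotations.
        raise FileNotFoundError(f"no annotation image for: {', '.join(missing)}")
    return pairs
```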
In this mode, the input dataset of images is organized by folders, one folder containing the images and another folder containing the annotation images (the names must match). The output produced is the dataset of augmented images stored in an hdf5 file. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is organized by folders, one folder containing the images and another folder containing the annotation images (the names must match). The output produced is the dataset of augmented images stored in an hdf5 file. The generation mode in this case is power. The power mode is a pipeline approach where augmentation techniques are chained together. In this approach, the images produced in one step of the pipeline are added to the dataset that will be fed to the next step of the pipeline.
In this mode, the input dataset of images is organized by folders, one folder containing the images and another folder containing the annotation images (the names must match). The output produced is a batch of images that can be fed to Keras. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is stored in a folder together with the annotations in the COCO format. The output produced is the dataset of augmented images together with the annotation in the COCO format. The generation mode in this case is linear; that is, given a dataset of n images, and a list of m augmentation techniques, each technique is applied to the n images.
In this mode, the input dataset of images is stored in a folder together with the annotations in the COCO format. The output produced is the dataset of augmented images together with the annotation in the COCO format. The generation mode in this case is sequential; that is, given a dataset of n images and a list of m augmentation techniques, all of the m augmentation techniques are applied to each of the n images. This will result in an output dataset of n images.
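The sequential generation mode can be sketched as a composition of all the techniques applied to each image. Images are stood in for by strings and techniques by functions that append their name; `tag` and `sequential_generation` are hypothetical helpers, and the sketch assumes the techniques are applied in the order in which they are listed.

```python
from functools import reduce


def tag(name):
    """Return a stand-in technique that records its name on the image label."""
    def technique(image):
        return f"{image}+{name}"
    return technique


def sequential_generation(images, techniques):
    """Apply all the techniques, one after another, to each image."""
    return [
        reduce(lambda current, technique: technique(current), techniques, image)
        for image in images
    ]


result = sequential_generation(["img1", "img2"], [tag("flip"), tag("rotate")])
# result == ["img1+flip+rotate", "img2+flip+rotate"]
```

Each input image yields exactly one output image, so the output has n images regardless of the number of techniques.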
In this mode, the input dataset of videos is organized by folders, and the label of a video is given by the name of the containing folder. The output produced is the dataset of augmented videos organized with the same structure as the input folder. The generation mode in this case is linear; that is, given a dataset of n videos, and a list of m augmentation techniques, each technique is applied to the n videos.
In this mode, the input dataset of videos is organized by folders and the annotation is given following the YouTube BB format. The output produced is the dataset of augmented videos organized with the same structure as the input folder, also using the YouTube BB format. The generation mode in this case is linear; that is, given a dataset of n videos, and a list of m augmentation techniques, each technique is applied to the n videos.
In this mode, the input dataset of stacks of images is organized in two folders: one with the stacks of images in tif format and another one with the masks of the stacks, also in tif format. The output produced is the dataset of augmented stacks of images organized with the same structure as the input folder. The generation mode in this case is linear; that is, given a dataset of n stacks, and a list of m augmentation techniques, each technique is applied to the n stacks.
It is possible to add new input-output-generation modes easily, as explained in Adding input-output-generation-modes.
The following techniques can be applied to augment the dataset of images.
- Average Blurring
- Bilateral Blurring
- Blurring
- Change color space to HSV
- Change color space to LAB
- Cropping
- Dropout
- Elastic deformations
- Equalize Histogram
- Flip
- Gamma correction
- Gaussian Blurring
- Gaussian Noise
- Invert
- Median Blurring
- None
- Raise blue channel
- Raise green channel
- Raise hue channel
- Raise red channel
- Raise saturation channel
- Raise value channel
- Resize
- Rotate
- Salt and pepper noise
- Sharpen
- Shift channel
- Shearing
- Translation
It is possible to add new techniques easily, as explained in Adding input-output-generation-modes.
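To give an idea of what a few of the listed techniques do, the sketch below implements None, Flip, Invert, and Translation on a grayscale image represented as a list of rows of pixel values. These are simplified stand-ins written for this example, not the library's implementations: the function names and parameters are assumptions, the image is assumed to be 8-bit grayscale, and only a horizontal flip is shown.

```python
def identity(image):
    """'None' technique: return an unchanged copy of the image."""
    return [row[:] for row in image]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def invert(image, max_value=255):
    """Replace each pixel p by max_value - p (8-bit range assumed by default)."""
    return [[max_value - pixel for pixel in row] for row in image]


def translate(image, dx, dy, fill=0):
    """Shift the image dx pixels right and dy pixels down, padding with `fill`."""
    height = len(image)
    width = len(image[0]) if height else 0
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


image = [[10, 20], [30, 40]]
flipped = flip_horizontal(image)     # [[20, 10], [40, 30]]
inverted = invert(image)             # [[245, 235], [225, 215]]
moved = translate(image, dx=1, dy=0) # [[0, 10], [0, 30]]
```

For problems other than classification, a geometric technique such as Flip or Translation must also be applied to the annotation (bounding boxes or masks), whereas a color technique such as Invert leaves the annotation unchanged.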