Journal
The pipeline now reads compressed (gzip) files instead of raw .wav files. This reduces the on-disk size of the data set by a factor of three. An augmentation method called time_shift_signal has been added, which splits the signal in two at a random point and places the second part before the first. Everything seems to be running well on the GPU, and the memory usage is low and stable.
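A minimal sketch of what such a time-shift augmentation could look like (the actual time_shift_signal in the repository may differ in details such as the random number source):

```python
import numpy as np

def time_shift_signal(signal, rng=np.random):
    """Split a 1-D signal at a random index and swap the two parts."""
    split = rng.randint(1, len(signal))
    return np.concatenate([signal[split:], signal[:split]])
```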
After running a first experiment with 10 mini-batches, each trained for 10 epochs on randomly chosen augmented data, the training loss seems to be slowly decreasing (from ~0.25 to ~0.20). This slower convergence is what we want, in the hope of a model that will generalize well. However, the validation loss does not seem to be decreasing at all. This may be due to the very low number of mini-batches and epochs, and a longer training run has been started to see if it changes the results. The training set consists of 5000 randomly chosen, augmented signal segments, from which each mini-batch is drawn at random.
Next to implement:
- a proper metric, e.g., Area Under Curve or Mean Average Precision (see the sketch below)
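As a rough idea, assuming multi-label targets stored as a binary indicator matrix, both metrics could be computed with scikit-learn (the arrays below are dummy data, purely for illustration):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Dummy multi-label ground truth and predicted scores, for illustration only.
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_score = np.array([[0.8, 0.1], [0.3, 0.7], [0.6, 0.9], [0.2, 0.4]])

print("AUC:", roc_auc_score(y_true, y_score, average="macro"))
print("mAP:", average_precision_score(y_true, y_score, average="macro"))
```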
Still not in the pipeline:
- pitch shift augmentation
- median filtering in the mask calculations
Implemented a data generator scheme which should be easy to use and can be configured to return a set of augmented samples in mini-batches. The mini-batches can then be used to fit the model for a couple of epochs in small steps (each mini-batch fits into memory). It seems to be working, and is running ok on the GPU.
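A minimal sketch of such a generator, assuming hypothetical helpers load_segment and augment (not the repository's actual API):

```python
import numpy as np

def mini_batch_generator(samples, batch_size, load_segment, augment, rng=np.random):
    """Yield (X, y) mini-batches of augmented samples.

    `samples` is a list of (filename, label) pairs; `load_segment` reads a
    segment from disk and `augment` applies the augmentation chain. Both
    helpers are placeholders for illustration.
    """
    while True:
        idx = rng.choice(len(samples), size=batch_size, replace=False)
        X, y = [], []
        for i in idx:
            filename, label = samples[i]
            X.append(augment(load_segment(filename)))
            y.append(label)
        yield np.array(X), np.array(y)
```

The training loop can then fit the model on each yielded mini-batch for a few epochs before requesting the next one.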
Next steps:
- compress data set and read from compressed files if needed
- add time augmentation
- add pitch shift augmentation
Implemented a couple of data augmentation generator methods. The augmented samples are now described using only their filenames and stored as dicts. It should be possible to create a large data set of such unique samples, and then only load the files into memory for each mini-batch that is generated from the augmented sample set.
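For illustration, such a dict could look like the following (the keys and file paths are hypothetical, not the repository's actual format):

```python
augmented_sample = {
    "signal_filename": "segments/signal_0042.wav",      # base signal segment
    "same_class_filename": "segments/signal_0117.wav",  # same-class segment to mix in
    "noise_filenames": [                                 # noise segments to mix in
        "segments/noise_0003.wav",
        "segments/noise_0015.wav",
        "segments/noise_0031.wav",
    ],
    "labels": [4, 9],                                    # class labels of the base segment
}
```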
- make sure that the mini-batch is removed from memory after it has been used
- compress the data set using gzip, and decompress on the fly in Python (see the sketch after this list)
- connect the augmented data samples to the training scheme
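A sketch of the on-the-fly decompression, assuming the segments end up stored as gzip-compressed wave files (the storage format is still an assumption at this point):

```python
import gzip
import io

import numpy as np
from scipy.io import wavfile

def read_gzipped_wave(path):
    """Decompress a .wav.gz file in memory and return (sample_rate, samples)."""
    with gzip.open(path, "rb") as f:
        buffer = io.BytesIO(f.read())
    sample_rate, samples = wavfile.read(buffer)
    return sample_rate, samples.astype(np.float32)
```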
Discussed how to load data for augmentation, and whether it makes sense to normalize the data in the spectral domain to zero mean and unit variance. The loading should probably be done on the fly from a compressed data set on disk, with each epoch drawn at random and the same-class + noise augmentation applied in real time.
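If the normalization turns out to be useful, a simple per-spectrogram version could look like this (a sketch only; whether to normalize per spectrogram, per frequency bin, or over the whole data set is still open):

```python
import numpy as np

def normalize_spectrogram(spectrogram, eps=1e-8):
    """Scale a spectrogram to zero mean and unit variance."""
    return (spectrogram - spectrogram.mean()) / (spectrogram.std() + eps)
```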
The next step is to write the relevant methods for loading and choosing the random batches of data, and then augmenting them on the fly. The augmented files are then used as input to the CNN. Concretely (a rough sketch of the augmentation step follows the list below):
- open channel to the compressed data set
- load as many segments as possible (memory constraints), at random without replacement, from the compressed data set
- randomly augment each segment with:
  - three noise segments
  - a same-class signal segment
  - a time/frequency shift
- connect the augmented segments to the network model
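A hypothetical sketch of the augmentation step, assuming simple additive mixing (the actual mixing weights and the frequency-shift part are still to be decided):

```python
import numpy as np

def augment_segment(signal, same_class_signal, noise_segments, rng=np.random):
    """Mix a signal segment with a same-class segment and noise segments,
    then apply the time-shift augmentation described earlier."""
    augmented = signal + same_class_signal
    for noise in noise_segments:
        augmented = augmented + noise
    split = rng.randint(1, len(augmented))
    return np.concatenate([augmented[split:], augmented[:split]])
```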
- Added mask scaling
- Added benchmark file
- Added preprocessing step
- Fixed a bottleneck in the preprocessing step
The script pp.py can now preprocess a data set. The script is hardcoded to the mlsp2013 training data, but should generalize to any data set of 16-bit mono wave files with a sample rate of 16000 Hz. The script assumes that a file called file2labels.csv is present in the data set directory, and will (a rough outline follows the list below):
- read each wave file
- mask out the noise and signal parts of the wave file
- split the noise and signal parts into equally sized segments
- save the segments in the specified output directory
- create a new file2labels.csv file in this directory with the labels for each signal segment
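A rough, simplified outline of that flow, assuming file2labels.csv maps base filenames to label lists; the masking step is omitted here (see compute_binary_mask below), and the segment length is an arbitrary placeholder:

```python
import csv
import glob
import os

import numpy as np
from scipy.io import wavfile

def split_into_segments(samples, segment_length):
    """Split a 1-D signal into equally sized segments, dropping the remainder."""
    n_segments = len(samples) // segment_length
    if n_segments == 0:
        return []
    return np.split(samples[:n_segments * segment_length], n_segments)

def preprocess(dataset_dir, output_dir, segment_length=16000):
    """Read each wave file, split it into segments, save the segments and
    write a new file2labels.csv with the labels for each segment."""
    with open(os.path.join(dataset_dir, "file2labels.csv")) as f:
        file2labels = {row[0]: row[1:] for row in csv.reader(f)}
    rows = []
    for path in glob.glob(os.path.join(dataset_dir, "*.wav")):
        sample_rate, samples = wavfile.read(path)
        name = os.path.splitext(os.path.basename(path))[0]
        for i, segment in enumerate(split_into_segments(samples, segment_length)):
            out_name = "{}_seg{:03d}.wav".format(name, i)
            wavfile.write(os.path.join(output_dir, out_name), sample_rate, segment)
            rows.append([out_name] + file2labels.get(name, []))
    with open(os.path.join(output_dir, "file2labels.csv"), "w") as f:
        csv.writer(f).writerows(rows)
```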
I have gotten first-hand experience with for-loop bottlenecks: simply by removing the for loops in the compute_binary_mask method and replacing them with numpy array operations, I got a speedup of around 30x in the wave file preprocessing.
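To illustrate the kind of change involved (this is not the actual compute_binary_mask code, just a thresholding example in the same spirit):

```python
import numpy as np

def binary_mask_loop(spectrogram, threshold):
    """Element-wise thresholding with explicit Python loops (slow)."""
    mask = np.zeros(spectrogram.shape, dtype=np.int8)
    for i in range(spectrogram.shape[0]):
        for j in range(spectrogram.shape[1]):
            if spectrogram[i, j] > threshold:
                mask[i, j] = 1
    return mask

def binary_mask_vectorized(spectrogram, threshold):
    """The same thresholding as a single vectorized numpy operation (fast)."""
    return (spectrogram > threshold).astype(np.int8)
```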
Daily takeaway: do not use iterative Python for loops where a vectorized numpy operation will do.
Do tomorrow:
- Create a couple of preprocessing images and compare to Mario nips2013
- Fix the scaling error in reshape_binary_mask
- Start with data augmentation