Preprocessing and feature extraction for raw voice data of DAIC-WOZ
- Run `download.sh` to download the DAIC-WOZ data
- Run `python main.py` to preprocess the raw audio and extract features
- Run `python daicwoz_label.py` to create labels
Based on the timestamps listed in the audio transcript file, the participant's speech segments are identified and all other segments are silenced to produce the preprocessed audio.
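As a rough illustration of this step, the sketch below keeps only the participant's turns and zeroes out everything else. The transcript format (tab-separated, with `start_time`, `stop_time`, and `speaker` columns) matches the DAIC-WOZ release, but the file paths are hypothetical:

```python
import numpy as np
import pandas as pd
import soundfile as sf

# Hypothetical paths; the actual layout depends on where download.sh puts the data.
wav_path = "data/303_AUDIO.wav"
transcript_path = "data/303_TRANSCRIPT.csv"

audio, sr = sf.read(wav_path)
transcript = pd.read_csv(transcript_path, sep="\t")

# Start from silence and copy back only the participant's turns.
out = np.zeros_like(audio)
for _, row in transcript.iterrows():
    if row["speaker"] == "Participant":
        start = int(row["start_time"] * sr)
        stop = int(row["stop_time"] * sr)
        out[start:stop] = audio[start:stop]

sf.write("data/303_PARTICIPANT.wav", out, sr)
```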
Because the timestamps in some transcript files are out of sync with the audio, they are first corrected by referring to adbailey1/daic_woz_process.
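The exact form of that correction is not spelled out here; below is a minimal sketch assuming it amounts to a constant per-session time shift, with the affected sessions and offset values to be taken from adbailey1/daic_woz_process:

```python
import pandas as pd

# Per-session offsets in seconds. The affected sessions and the real correction
# values come from adbailey1/daic_woz_process; this dict is only a placeholder.
TIME_OFFSETS: dict[int, float] = {}

def corrected_transcript(path: str, session_id: int) -> pd.DataFrame:
    """Load a transcript and shift its timestamps by the session's offset."""
    transcript = pd.read_csv(path, sep="\t")
    shift = TIME_OFFSETS.get(session_id, 0.0)
    transcript["start_time"] += shift
    transcript["stop_time"] += shift
    return transcript
```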
After this, per-second openSMILE features and VGGish features are extracted from the preprocessed audio. VGGish embeddings are computed with harritaylor/torchvggish, a PyTorch implementation of VGGish.
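One way to do both extractions is with the `opensmile` Python package and torchvggish loaded through `torch.hub`. In the sketch below, the eGeMAPS feature set and the one-second windowing are assumptions, not necessarily the configuration this repo uses, and the path is hypothetical:

```python
import opensmile
import pandas as pd
import soundfile as sf
import torch

wav_path = "data/303_PARTICIPANT.wav"  # hypothetical path from the previous step

# openSMILE functionals computed over consecutive one-second windows
# (eGeMAPS here is an assumption; the repo may use a different feature set).
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)
duration = sf.info(wav_path).duration
features = pd.concat(
    smile.process_file(wav_path, start=float(t), end=min(float(t) + 1.0, duration))
    for t in range(int(duration))
)

# VGGish embeddings: one 128-dim vector per ~0.96 s patch of audio.
vggish = torch.hub.load("harritaylor/torchvggish", "vggish")
vggish.eval()
embeddings = vggish.forward(wav_path)
```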
Finally, the label CSVs provided with DAIC-WOZ are combined to create the labels for model training.
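Concretely, this can be done by concatenating the AVEC 2017 split files shipped with the corpus. In the sketch below the paths are hypothetical, and the rename accounts for the test split using `PHQ_Binary`/`PHQ_Score` column names instead of `PHQ8_*` (worth verifying against your copy of the data):

```python
import pandas as pd

# Standard DAIC-WOZ / AVEC 2017 split files; directory layout is hypothetical.
train = pd.read_csv("labels/train_split_Depression_AVEC2017.csv")
dev = pd.read_csv("labels/dev_split_Depression_AVEC2017.csv")
test = pd.read_csv("labels/full_test_split.csv")

# Harmonize the test split's column names with the train/dev splits.
test = test.rename(columns={"PHQ_Binary": "PHQ8_Binary", "PHQ_Score": "PHQ8_Score"})

cols = ["Participant_ID", "PHQ8_Binary", "PHQ8_Score"]
labels = pd.concat([train[cols], dev[cols], test[cols]], ignore_index=True)
labels.to_csv("labels/all_labels.csv", index=False)
```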