Simply download LibriSpeech from OpenSLR and unzip it. Fill in the path in the config file for self-supervised learning with the path to the unzipped LibriSpeech.
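
As a quick sanity check before editing the config, a short script like the one below can confirm the unzipped LibriSpeech root looks as expected; the root path and the list of splits to check are placeholders, not values from this repo.

```python
# Sanity-check the unzipped LibriSpeech directory (path and split names are placeholders).
from pathlib import Path

librispeech_root = Path("/path/to/LibriSpeech")
expected_splits = ["train-clean-100", "dev-clean", "test-clean"]

for split in expected_splits:
    split_dir = librispeech_root / split
    n_flac = len(list(split_dir.rglob("*.flac"))) if split_dir.is_dir() else 0
    status = "found" if split_dir.is_dir() else "MISSING"
    print(f"{split}: {status} ({n_flac} flac files)")
```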
- Download the WSJ dataset (requires an LDC license)
- Download and compile sph2pipe_v2.5 to read the WSJ dataset

```bash
wget http://www.openslr.org/resources/3/sph2pipe_v2.5.tar.gz
tar xzf sph2pipe_v2.5.tar.gz
cd sph2pipe_v2.5; gcc -o sph2pipe *.c -lm
```
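
Once compiled, the binary converts WSJ's SPHERE-format audio (`.wv1`) to wav; a minimal sketch of calling it from Python is below. The `-f wav` flag is standard sph2pipe usage, but the input/output paths are placeholders.

```python
# Convert one WSJ SPHERE file to wav with the compiled sph2pipe (paths are placeholders).
import subprocess

subprocess.run(
    ["sph2pipe_v2.5/sph2pipe", "-f", "wav",
     "/path/to/downloaded/wsj/example_utterance.wv1",
     "/tmp/example_utterance.wav"],
    check=True,
)
```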
- Refactor WSJ (generate wav files and place them all together) with

```bash
python refactor_wsj.py --wsj_root /path/to/downloaded/wsj/ \
                       --dest /path/to/store/new/wsj/
```
- (For phone classification only.) For each utterance, please use Kaldi to obtain the forced alignment and store the corresponding phone index sequence with `torch.save` at `/path/to/store/new/wsj/phn/fileid.pt` (or `fileid_nocrop.pt` for the `dev93` split), where `fileid.wav` can be found at `/path/to/store/new/wsj/wav/` after the previous step. Last, copy the lists of `fileid`s of the different splits to the refactored WSJ dataset for later use with

```bash
cp -r phn_split/ /path/to/store/new/wsj/wav/meta/
```
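
As an illustration of the expected on-disk format for this step, the sketch below saves one utterance's phone index sequence with `torch.save`. Obtaining the per-frame phone indices from the Kaldi forced alignment (e.g. with `ali-to-phones`) is up to you; the `fileid` and the index values here are placeholders.

```python
# Save a phone index sequence for one utterance (fileid and indices are placeholders).
import torch
from pathlib import Path

phn_dir = Path("/path/to/store/new/wsj/phn")
phn_dir.mkdir(parents=True, exist_ok=True)

fileid = "some_fileid"                               # must match fileid.wav under wav/
phone_indices = torch.tensor([3, 3, 3, 17, 17, 42])  # frame-level phone ids from the Kaldi alignment
torch.save(phone_indices, phn_dir / f"{fileid}.pt")  # use f"{fileid}_nocrop.pt" for the dev93 split
```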
- (For speaker classification only.) The lists of `fileid` & `speaker` pairs used in the different splits are stored at `spk/`. Copy them to the refactored WSJ dataset for later use with

```bash
cp -r spk_split/ /path/to/store/new/wsj/wav/spk/
```
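
If you want to double-check the copied lists against the refactored wavs, a sketch like the one below can help; it assumes, purely for illustration, that each line of a split file starts with the `fileid` (the actual file names and format are defined by the files under `spk/`).

```python
# Verify that every fileid in a speaker split file has a wav (file name/format assumed).
from pathlib import Path

wsj_root = Path("/path/to/store/new/wsj")
split_file = wsj_root / "wav" / "spk" / "train.txt"   # hypothetical split file name

missing = []
for line in split_file.read_text().splitlines():
    if not line.strip():
        continue
    fileid = line.split()[0]                          # assumes the line starts with the fileid
    if not (wsj_root / "wav" / f"{fileid}.wav").is_file():
        missing.append(fileid)

print(f"{len(missing)} fileids without a wav file")
```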
- Modify the `path` in the config file for downstream tasks to `/path/to/store/new/wsj/`