How to use this code for speaker recognition

>> pip install speechbrain

First, the repository contains:

custom_model.py: a TDNN model defined in PyTorch,

inference.py: used to run predictions on new .flac audio files,

inference_noise.py: used to run predictions on new .flac audio files with noise,

mini_librispeech_prepare.py: used to prepare the audio files; it creates train.json, valid.json and test.json. You do not need to run it yourself, because train.py runs mini_librispeech_prepare.py first.

train.py: runs the whole training pipeline; run it before anything else.

data folder: contains all the noise and audio data used in the project. The file tree is shown below.

├─content
│  └─best_model
├─data
│  ├─LibriSpeech_SI
│  │  ├─noise
│  │  ├─test
│  │  ├─test-noisy
│  │  └─train
│  │      ├─spk001
│  │      ├─spk002
│  │      ├─spk003
│  │      ├─spk004
│  │      ├─......
│  └─RIRS_NOISES
│      ├─pointsource_noises
│      ├─real_rirs_isotropic_noises
│      └─simulated_rirs
│          ├─largeroom
│          │  ├─Room001
│          │  ├─Room002
│          │  └─......
│          ├─mediumroom
│          │  ├─Room001
│          │  ├─Room002
│          │  ├─......
│          └─smallroom
│              ├─Room001
│              ├─Room002
│              ├─......
├─pretrained_models
│  ├─EncoderClassifier-c7c226823887b6cf7433ddb7e3319813
│  └─EncoderClassifier-fab6304d6f8a8e30333af1f33bbe745c
├─results
│  └─speaker_id
│      └─1986
│          └─save
│              └─CKPT+2022-12-29+04-15-58

content/best_model folder: stores our trained model and the related .yaml file, which we use for inference, i.e. to run predictions on new audio files.

To begin, run

>> python train.py train.yaml

This produces train.json, valid.json and test.json, which list all training audio files, split automatically into train, validation and test sets. It also creates the results folder, which stores the trained TDNN model and other related files generated from our audio files.
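
For orientation, the split step can be sketched roughly as follows. This is a minimal illustration, not the code in mini_librispeech_prepare.py: it assumes one subfolder per speaker (as in data/LibriSpeech_SI/train in the tree above), and the 80/10/10 ratio, the seed, the JSON field names `wav` and `spk_id`, and the helper names `split_files` and `write_manifests` are all assumptions.

```python
import json
import random
from pathlib import Path


def split_files(files, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the file list and cut it into train/valid/test parts."""
    files = sorted(files)
    random.Random(seed).shuffle(files)  # fixed seed keeps the split repeatable
    n_train = int(len(files) * ratios[0])
    n_valid = int(len(files) * ratios[1])
    return {
        "train": files[:n_train],
        "valid": files[n_train:n_train + n_valid],
        "test": files[n_train + n_valid:],  # whatever is left over
    }


def write_manifests(data_dir, out_dir="."):
    """Write train.json, valid.json and test.json for all .flac files."""
    files = [str(p) for p in Path(data_dir).rglob("*.flac")]
    for name, subset in split_files(files).items():
        # Assumed layout: the parent folder name is the speaker label.
        entries = {
            Path(f).stem: {"wav": f, "spk_id": Path(f).parent.name}
            for f in subset
        }
        with open(Path(out_dir) / f"{name}.json", "w") as fh:
            json.dump(entries, fh, indent=2)
```

Calling `write_manifests("data/LibriSpeech_SI/train")` would write the three files into the current folder. The sketch keys entries by file name only, so it relies on file names being unique across speakers.
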
Then we write a plot.py script that plots train_loss and valid_loss from the train_log.txt file, which records how both losses change during training.

>> python plot.py
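
A plotting script along these lines might look like the sketch below. It is not the actual plot.py: the log line format shown in the comment is an assumption based on typical SpeechBrain logs, and `parse_log`, `plot_losses` and the output name `loss.png` are hypothetical. Plotting requires matplotlib.

```python
import re

# Assumed log line shape, e.g.:
# "epoch: 1, lr: 1.00e-03 - train loss: 2.70 - valid loss: 3.39, valid error: 9.47e-01"
LINE_RE = re.compile(
    r"epoch:\s*(\d+).*?train[ _]loss:\s*([-+.\deE]+)"
    r".*?valid[ _]loss:\s*([-+.\deE]+)"
)


def parse_log(lines):
    """Return (epochs, train_losses, valid_losses) from log lines."""
    epochs, train, valid = [], [], []
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines that do not look like epoch summaries are skipped
            epochs.append(int(match.group(1)))
            train.append(float(match.group(2)))
            valid.append(float(match.group(3)))
    return epochs, train, valid


def plot_losses(log_path, out_path="loss.png"):
    """Plot both loss curves against the epoch number and save the figure."""
    import matplotlib.pyplot as plt  # imported here so parsing works without it

    with open(log_path) as fh:
        epochs, train, valid = parse_log(fh)
    plt.plot(epochs, train, label="train_loss")
    plt.plot(epochs, valid, label="valid_loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig(out_path)
```

Call `plot_losses(...)` with the path to your train_log.txt; if your log lines look different, adjust `LINE_RE` accordingly.
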

Next, based on the checkpoint in results/speaker_id/1986/save/CKPT+2022-12-29+04-15-58+00, which contains the model-related configuration files, we update content/best_model/hparams_inference.yaml, which is used for prediction.
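
If you retrain, the checkpoint folder name will differ from the one above. A small helper like the following can locate the newest one; this is only a sketch, `latest_checkpoint` is a hypothetical name, and it assumes checkpoint folders keep the timestamped `CKPT+...` naming shown in the tree. Note that the newest checkpoint is not necessarily the best one.

```python
from pathlib import Path


def latest_checkpoint(save_dir):
    """Return the newest CKPT+... folder under save_dir, or None if none exist."""
    ckpts = [p for p in Path(save_dir).glob("CKPT+*") if p.is_dir()]
    # The folder names embed a date and time, so sorting by name
    # orders them chronologically.
    return max(ckpts, key=lambda p: p.name, default=None)
```

For example, `latest_checkpoint("results/speaker_id/1986/save")` returns the path to use when updating hparams_inference.yaml.
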
Finally, we run inference.py and inference_noise.py. This produces two .txt files containing the predictions for the data in data/LibriSpeech_SI/test and data/LibriSpeech_SI/test-noisy.

>> python inference.py
>> python inference_noise.py
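
Conceptually, the inference step loads the model from content/best_model and classifies each test file. The sketch below shows one way to do that with SpeechBrain's `EncoderClassifier`; it is not the actual inference.py. It assumes the model loads through `EncoderClassifier.from_hparams` with hparams_inference.yaml, and the helper names and the output format (one `filename label` pair per line) are assumptions.

```python
from pathlib import Path


def load_classifier(model_dir="content/best_model"):
    """Load the trained model; needs speechbrain and the files in model_dir."""
    # In SpeechBrain 1.0+ this class lives in speechbrain.inference instead.
    from speechbrain.pretrained import EncoderClassifier

    return EncoderClassifier.from_hparams(
        source=model_dir, hparams_file="hparams_inference.yaml"
    )


def speechbrain_label(classifier):
    """Wrap classify_file so it returns only the predicted speaker label."""
    def classify(path):
        # classify_file returns (probabilities, score, index, text labels)
        _, _, _, text_lab = classifier.classify_file(path)
        return text_lab[0]
    return classify


def predict_folder(classify, audio_dir, out_path):
    """Classify every .flac file in audio_dir and write one line per file."""
    with open(out_path, "w") as out:
        for wav in sorted(Path(audio_dir).rglob("*.flac")):
            out.write(f"{wav.name} {classify(str(wav))}\n")
```

A run over the clean test set would then be `predict_folder(speechbrain_label(load_classifier()), "data/LibriSpeech_SI/test", "predictions.txt")`, where predictions.txt is a placeholder name. Passing the classifier in as a function keeps the file loop usable with any model.
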

Repository: wla-98/speakers_recognition