Influence of audio augmentation techniques on STT systems.

This project investigates the impact of audio augmentation techniques on speech-to-text (STT) systems for low-resource languages. To this end, several models were trained on the same training data, each with a different augmentation technique, and then compared. The training data came from the Common Voice dataset, artificially reduced to 41 hours to simulate a low-resource language.
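As a rough illustration of how a corpus might be reduced to a fixed number of hours, the sketch below randomly selects clips until a duration budget is filled. It is not the project's own code: the function name `subsample_to_hours` and the `(clip_id, duration_seconds)` input format are assumptions made for this example.

```python
import random

def subsample_to_hours(clips, target_hours, seed=0):
    """Randomly pick clips whose total duration stays within target_hours.

    clips: list of (clip_id, duration_seconds) pairs (hypothetical format).
    Returns the selected clip ids and the total selected duration in hours.
    """
    budget = target_hours * 3600.0  # budget in seconds
    shuffled = list(clips)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    selected, total = [], 0.0
    for clip_id, duration in shuffled:
        if total + duration > budget:
            continue  # skip clips that would exceed the budget
        selected.append(clip_id)
        total += duration
    return selected, total / 3600.0
```

Fixing the seed keeps the reduced set identical across runs, which matters when several models are meant to be compared on the same training data.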

Augmentation comparison

The following table shows how different augmentations affect training and, in turn, the resulting model. The pretrained model facebook/wav2vec2-large-xlsr-53 was fine-tuned on 41 hours of the Common Voice dataset, with augmentation doubling the amount of training data.

The CV-SM and CV-MD entries are reference values trained without augmentation. CV-SM is the 41-hour training set on its own; CV-MD is a set with twice as much data as CV-SM, all of it unaugmented. The training sets are therefore the same size for every entry except CV-SM. CV-MD is intended as the best case, and CV-SM as the baseline without augmentation.
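Doubling a training set by augmentation can be sketched as follows: every original sample is kept, and one augmented copy with the same transcript is added. Additive white noise at a fixed signal-to-noise ratio is used here purely as an example augmentation; the function names, the 20 dB default, and the `(samples, transcript)` format are assumptions for this sketch and are not taken from the project.

```python
import math
import random

def add_noise(samples, snr_db, rng):
    """Return a copy of samples with white Gaussian noise at the given SNR (dB).

    samples: non-empty list of floats representing a waveform.
    """
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]

def double_by_augmentation(dataset, snr_db=20.0, seed=0):
    """Keep every (samples, transcript) pair and append one augmented copy of each."""
    rng = random.Random(seed)
    augmented = [(add_noise(samples, snr_db, rng), text) for samples, text in dataset]
    return list(dataset) + augmented
```

Because the transcript is unchanged, the augmented copies add acoustic variety without requiring any new labelled data.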

The two reference models (CV-SM and CV-MD) and the best augmentation technique are compared with other models below. Unless stated otherwise, those other models were trained on the complete Common Voice (CV) training set. The models trained on CV-Train therefore saw about 9 times as much data.

NiklasHoltmeyer/Influence-of-audio-augmentation-on-STT
