Dataset quality #364

JRMeyer · 2021-03-07T09:21:10Z

>>> Thierno_Ibrahima_DIOP
[March 1, 2021, 2:40pm]

Hi, do you think this sound quality is good enough to train a Tacotron
model? I am getting bad synthesized audio like the samples below after
40,000 iterations.

Samples from training: [links not preserved]

Synthesized samples (during evaluation): [links not preserved]

Thanks for your help.

[This is an archived TTS discussion thread from discourse.mozilla.org/t/datset-quality]

JRMeyer · 2021-03-07T09:21:12Z

>>> nmstoker
[March 2, 2021, 2:24am]

Your files are not accessible to me (and I presume others). There's a
message saying they're blocked by the site owner.

However even with access, I think it would be something of a challenging
question.

A significant factor will be the total quantity of audio you've got to
train with (which you didn't mention). A rather more intangible aspect
is how consistent the audio is: you can have plenty of good quality
audio, and yet if it's too varied and inconsistent it will be a struggle
to train a model. Of course, confirming 'what's consistent enough' is
going to be nigh on impossible. Are your transcriptions accurate, or are
there potentially errors?
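
As a rough way to answer the "total quantity of audio" question, here is a minimal sketch that sums the duration of the WAV files in a folder. It assumes uncompressed PCM WAV files sitting directly in one directory; the function names `wav_duration_seconds` and `total_hours` are made up for illustration and are not part of any TTS toolkit.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def total_hours(wav_dir):
    """Sum the durations of all .wav files directly under wav_dir, in hours."""
    files = sorted(Path(wav_dir).glob("*.wav"))
    seconds = sum(wav_duration_seconds(p) for p in files)
    return seconds / 3600.0
```

Calling `total_hours("path/to/wavs")` on your own folder gives a single number you can quote when asking for help. It says nothing about consistency or transcription accuracy, which still need to be checked separately.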

Have you tried training with standard datasets? If so, how did you get
on with them? How does the training you're doing with this data compare
to that data? You often can't make assumptions across datasets, but at
least it would give you reassurance that you've got the basics working.
You might also be able to see how artificially degrading the quality of
a known dataset impacts the ability to train on it, until you get
something approaching your set. Those are just some areas to think
about.
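
One possible way to try the "artificially degrading" idea is to mix white noise into clean samples at a chosen signal-to-noise ratio. The sketch below is only an illustration under stated assumptions: samples are plain Python floats, the noise is Gaussian white noise (real recordings may be degraded in other ways, such as reverb or clipping), and the names `add_noise_at_snr` and `measured_snr_db` are hypothetical helpers, not functions from a TTS library.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(samples, snr_db, seed=0):
    """Return a copy of samples with white Gaussian noise added at snr_db."""
    rng = random.Random(seed)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = mean_power(samples) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(noise))


# Example: one second of a 440 Hz tone at 16 kHz, degraded to roughly 20 dB SNR.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_noise_at_snr(clean, snr_db=20)
```

Lowering `snr_db` step by step on a dataset that is known to train well would show roughly where training starts to break down, which gives a reference point for judging your own recordings.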

[Archived Post]
