Getting audio with contents unrelated to the text input #17

Answered by MisakaMikoto96
jry-king asked this question in Q&A

Yes, the model is too small to train an AR model with good performance: the AR output does not have a stable length, so the output duration is unpredictable and intelligible speech is hard to get.
I tried it on LibriTTS (small) and got similar results: some intelligibility on short sentences and on the first few words of long sentences, and even those results were stochastic.

Replies: 1 comment 1 reply

@agupta54

Answer selected by jry-king
3 participants