Getting audio with contents unrelated to the text input #17
I tried to train a model a bit larger than the nano setting with the config
Replies: 1 comment 1 reply
Yes, the model is too small to train an AR model with good performance: the AR output does not have a stable length (output duration), and it is difficult to get intelligible speech from it.
I tried it on LibriTTS (small) and got similar results: some intelligibility on short sentences and on the first few words of a long sentence, and even that varied stochastically from run to run.