About the fine-tuning approach #7

Open
AdamBear opened this issue Mar 11, 2022 · 3 comments

Comments

@AdamBear

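I saw in your commit history that you fixed a bug in fastspeech2 fine-tuning. Could you explain how to fine-tune?
Retraining everything from scratch every time new recordings are added is too slow.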

@jerryuhoo
Owner

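The fine-tuning in that commit refers to fine-tuning the vocoder. Generally, if you train with the speedyspeech model, the results are poor unless you also fine-tune the vocoder; see the fine-tuning method in the PaddleSpeech repository. As for having to retrain everything each time you add one more speaker's voice: for now that is indeed the only option, unless the model structure is changed, for example by removing the speaker embedding and replacing it with a reference-audio structure. I have not tried that yet.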

@AdamBear
Author

Thank you very much!

@jerryuhoo jerryuhoo reopened this May 16, 2022
@jerryuhoo
Owner

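A fine-tuning approach that currently works: first download the aishell3 pretrained model from the paddlespeech examples and the aishell3 dataset, then replace the folder of any one speaker in aishell3 with your own dataset. Replace as many speakers as you have datasets, so that the total number of speakers stays unchanged. Note that, per the Readme, before normalize you need to replace the generated phone_id_map.txt with the phoneme dictionary of the existing model; otherwise the phone id mapping will be wrong, which affects the pronunciation learned during training. After the replacement, you still need to fine-tune the vocoder to get better results.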
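
The two file-level steps above (swapping a speaker folder and reusing the pretrained phoneme dictionary) could be scripted roughly as follows. This is only a minimal sketch, not code from this repository or from PaddleSpeech: the function names and all paths are hypothetical, and it assumes that phone_id_map.txt holds one `<phone> <id>` pair per line. Any file-naming or transcript requirements of the preprocessing scripts are not handled here.

```python
import shutil
from pathlib import Path


def swap_speaker_dir(corpus_wav_dir, speaker_id, custom_dir):
    """Replace one existing speaker folder with custom recordings.

    The folder name (speaker id) is kept, so the total number of
    speakers seen by the pretrained model stays unchanged.
    """
    target = Path(corpus_wav_dir) / speaker_id
    if not target.is_dir():
        raise FileNotFoundError(f"no such speaker folder: {target}")
    shutil.rmtree(target)               # drop the original speaker's audio
    shutil.copytree(custom_dir, target)  # put the custom recordings in its place
    return target


def read_phone_id_map(path):
    """Parse a phone_id_map.txt (assumed format: '<phone> <id>' per line)."""
    mapping = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            phone, idx = line.split()
            mapping[phone] = int(idx)
    return mapping


def use_pretrained_phone_map(pretrained_map, generated_map):
    """Overwrite the freshly generated map with the pretrained one.

    Returns the phones that appear in the generated map but not in the
    pretrained one; the pretrained model has no id for those phones.
    """
    pretrained = read_phone_id_map(pretrained_map)
    generated = read_phone_id_map(generated_map)
    missing = sorted(set(generated) - set(pretrained))
    shutil.copyfile(pretrained_map, generated_map)
    return missing
```

A non-empty return value from `use_pretrained_phone_map` would suggest that the custom transcripts contain phones the pretrained model never saw, which is worth checking before running normalize.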
