Question: advice on building a fine-tuning dataset #406

Open
jizh18 opened this issue Jul 20, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

jizh18 commented Jul 20, 2024

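Hi everyone, I have a few questions and would really appreciate your help.
We would like to adapt this method to a low-resource language spoken in Uganda, and would like to ask how best to build the dataset. Specifically:

  1. What is a suitable average length for each sample in the dataset?

  2. Do the lab files need to mark the start and end times of the audio precisely?

  3. If the language is very similar to English, differing only in some pronunciations, roughly how many hours of data would be appropriate?

    Best regards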

jizh18 added the enhancement (New feature or request) label Jul 20, 2024