
Finetuning guide and support of other languages #12

Closed
ValfarDeveloper opened this issue Jan 28, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@ValfarDeveloper

Hi! I just want to say thank you for creating this SOTA model, great job!

I have a couple of questions:

  • Do you know how we can fine-tune this model?
  • Is there a possibility of supporting more languages? Can we achieve this with fine-tuning?
@ValfarDeveloper ValfarDeveloper changed the title Finetuning guide and support of another languages Finetuning guide and support of other languages Jan 28, 2025
@a43992899
Collaborator

Actually, it supports a wide range of languages, but not all of them are stable. You may test it yourself.

See part of our annealing-phase top language distribution below. There are more languages, but only the top ones are shown.

[Image: annealing-phase top language distribution]

@a43992899
Collaborator

For fine-tuning, we are working on usable Hugging Face fine-tuning code.

It will probably include an example of enabling BPM control.

Learning new languages will require a large amount of data and compute, since you may need to continually pretrain the stage-1 7B LM.
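Since the official fine-tuning code isn't published yet, here is a minimal, hypothetical sketch of what "continual pretraining" of a stage-1 causal LM means in practice: keep training the model on its next-token prediction objective over new-language data. The model below is a toy stand-in (a few thousand parameters, random token IDs instead of a real tokenizer/dataset); the real workflow would load the released 7B checkpoint and a proper data pipeline instead.

```python
# Hedged sketch of continual pretraining: a toy causal LM trained with the
# standard next-token cross-entropy objective. All sizes, data, and the model
# itself are placeholders, NOT the actual YuE stage-1 7B LM.
import torch
import torch.nn as nn


class ToyCausalLM(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=128, dropout=0.0, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        h = self.encoder(self.embed(ids), mask=mask)
        return self.lm_head(h)


def continual_pretrain_step(model, opt, batch):
    # Next-token prediction: inputs are tokens [0..n-1], targets are [1..n].
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


torch.manual_seed(0)
model = ToyCausalLM()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Stand-in for tokenized new-language training data.
batch = torch.randint(0, 256, (4, 33))
losses = [continual_pretrain_step(model, opt, batch) for _ in range(20)]
print(losses[0] > losses[-1])  # loss should fall as the model fits the batch
```

The comment's point about cost follows directly from this loop: at 7B parameters, each `backward()`/`step()` touches billions of weights, and learning a genuinely new language needs enough data for many such steps, hence the large data and compute requirement.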
