Synthetic dataset generation - Brief howto built on top of llama.cpp #3568
paschembri
started this conversation in
Show and tell
Replies: 1 comment 1 reply
-
I suggest you first explore the grammar functionality in llama.cpp and the excellent grammar builder helper app. I suspect you'll find them valuable.
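To make the grammar suggestion concrete, here is a minimal sketch of how a label set for a classification dataset could be turned into a GBNF grammar string, so that the model can only emit one of the allowed labels. The helper names (`gbnf_literal`, `labels_to_gbnf`) and the example labels are hypothetical and not taken from the original howto; only the `root ::= "a" | "b"` rule syntax comes from llama.cpp's grammar format.

```python
def gbnf_literal(text: str) -> str:
    """Quote a string as a GBNF literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def labels_to_gbnf(labels: list[str]) -> str:
    """Build a one-rule GBNF grammar whose only valid outputs are the given labels."""
    if not labels:
        raise ValueError("at least one label is required")
    alternatives = " | ".join(gbnf_literal(label) for label in labels)
    return f"root ::= {alternatives}\n"


# Hypothetical label set for a sentiment classification dataset.
grammar = labels_to_gbnf(["positive", "negative", "neutral"])
print(grammar)
```

The resulting string can be saved to a file and passed to llama.cpp with `--grammar-file`, or loaded through `LlamaGrammar.from_string` if you use the llama-cpp-python bindings. Check the version you have installed, since both interfaces have changed over time.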
-
I'm currently exploring fine-tuning smaller models for classification / fill-mask.
I wrote a quick intro on how to produce a synthetic dataset in a few lines of Python.
It covers:
If you have performance tips for this kind of task, I'm all ears 👀. Now I have to look into how to implement DistilBERT using ggml 😱