[FEATURE] (v1.3.9) - ROPE calculator in launcher, please. #375
Comments
Generally this is more of an art than a science. You usually want to use either NTK-aware scaling (change the big number) or linear scaling (change the small number), not both together.
For linear, the target is to find the largest number that still results in coherent output. For 2x context this is 0.5, for 4x context this is 0.25, and so on.
For NTK-aware, the target is to find the smallest number that still results in coherent output. This seems to be non-linear: for 2x it's somewhere around 10000 -> 32000, and for 4x maybe about 80k.
For more info, refer to ggerganov#2402, but ultimately you need to trial and error.
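As a rough sketch of what such a calculator could look like, the rules of thumb above can be written down in a few lines of Python. All function names here are made up for illustration and are not part of KoboldCPP. The linear part is just 1 / (context multiplier). For NTK-aware, the lookup table holds the ballpark values quoted in this thread, and `ntk_base_formula` is the commonly cited closed form, assuming a head dimension of 128 and an original base of 10000. It gives noticeably lower numbers than the values reported here, so treat either one only as a starting point for trial and error.

```python
# Hypothetical helper names; none of this is KoboldCPP code.

# Ballpark NTK-aware bases reported in this thread (context multiplier -> base).
THREAD_NTK_BASES = {2: 32000, 4: 80000}


def linear_scale(context_multiplier: float) -> float:
    """Linear RoPE scale: 0.5 for 2x context, 0.25 for 4x, and so on."""
    if context_multiplier < 1:
        raise ValueError("context_multiplier must be >= 1")
    return 1.0 / context_multiplier


def ntk_base_formula(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """Commonly cited NTK-aware base: base * alpha ** (d / (d - 2)).

    Assumptions: head_dim=128 and base=10000. alpha is a tuning knob and
    is not guaranteed to equal the context multiplier.
    """
    return base * alpha ** (head_dim / (head_dim - 2))


def starting_points(context_multiplier: int) -> dict:
    """Collect starting values to try; only one method should be used at a time."""
    return {
        "linear_scale": linear_scale(context_multiplier),
        "ntk_base_from_thread": THREAD_NTK_BASES.get(context_multiplier),
        "ntk_base_from_formula": round(ntk_base_formula(context_multiplier)),
    }


if __name__ == "__main__":
    for multiplier in (2, 4):
        print(multiplier, starting_points(multiplier))
```

The thread values and the formula disagree, which fits the point that this needs tuning per model rather than a single exact calculation.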
High for linear, low for NTK? That is a useful detail. Thank you, that points me in the right direction. Having rules and structure for the madness is very good. :)
Not "high", but more like "keep it as high as it can go while it still works". The perplexity-to-rope-scale relationship follows a CURVE. Too high or too low will give bad results.
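Since perplexity follows a curve, the trial and error can be framed as a small sweep: measure perplexity at a few candidate settings and keep the lowest one. The sketch below assumes you supply your own `measure_perplexity` function (for example, something that runs your model on a fixed text with the given setting). That function and `toy_curve` are hypothetical stand-ins, not real KoboldCPP or llama.cpp calls.

```python
from typing import Callable, Iterable, Tuple


def pick_best_setting(
    candidates: Iterable[float],
    measure_perplexity: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (best_setting, best_perplexity) over the candidate values."""
    results = [(measure_perplexity(c), c) for c in candidates]
    best_ppl, best = min(results)  # lowest perplexity wins
    return best, best_ppl


def toy_curve(rope_base: float) -> float:
    """Made-up U-shaped curve with its minimum at 32000, for demonstration only."""
    return (rope_base / 32000 - 1) ** 2 + 5.0


if __name__ == "__main__":
    bases_to_try = [10000, 20000, 32000, 50000, 80000]
    print(pick_best_setting(bases_to_try, toy_curve))
```

With a real measurement function the sweep is slow, so a handful of candidates around the starting values is usually all that is practical.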
I have been finding that the default ROPE in KoboldCPP is very unreliable. It takes some tweaking to find the right setting. Problem is, I have to go over to the LlamaCPP github and dig around to find workable settings from people who are trying out ROPE settings.
It would be nice if there were a rope calculator in the launcher, so that I could homebrew scaling myself. An example of scaling that I am using for Airoboros 33b 16k:
0.5, 70000.
Going from what I saw when trawling the githubs, the big number should be the only one that is changed. Apparently that reduces perplexity, being an NTK-aware scaling. Problem is, I don't know how to calculate the scaling.
Jxy's github post has some calculation numbers. Being terrible at math, I don't understand them.
Implement customizable RoPE