Add Baichuan2 Support #247
This looks great, thanks for the PR @AoyuQC. Could you run perplexity on FP16 vs INT4 quantized so we can see how much performance is degraded?
Hi @casper-hansen, I ran the perplexity test on FP16 vs INT4: the FP16 version scores 6.800 and the INT4 version scores 6.938. Please check the following image for the experiment results.
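For readers comparing the two numbers above, here is a minimal sketch of how the gap can be expressed as a relative increase, together with the standard definition of perplexity as the exponential of the mean per-token negative log-likelihood. The helper names `perplexity` and `relative_increase` are hypothetical and are not part of this PR or of the library; only the values 6.800 and 6.938 come from the comment above.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_nlls` is a hypothetical list of per-token NLL values (natural log).
    """
    return math.exp(sum(token_nlls) / len(token_nlls))


def relative_increase(baseline, quantized):
    """Fractional increase of the quantized perplexity over the baseline."""
    return (quantized - baseline) / baseline


# Values reported in the comment above.
fp16_ppl = 6.800
int4_ppl = 6.938

print(f"absolute increase: {int4_ppl - fp16_ppl:.3f}")
print(f"relative increase: {relative_increase(fp16_ppl, int4_ppl):.2%}")
```

With the reported values, the INT4 perplexity is 0.138 higher than FP16, roughly a 2% relative increase.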
Solid numbers @AoyuQC! And great work :) I have a few questions:
Hi @casper-hansen, I have reused
Add support for baichuan2, issue #50