Add uintx quant to generate and eval #811
Dr. CI: artifacts and rendered test results are at hud.pytorch.org/pr/pytorch/ao/811. No failures as of commit d5ebc0e with merge base 317392d.
I would put the generate/eval results in a table somewhere. If you want to add them to the standard benchmarks, you can add them to benchmarks.sh. Also, I would rebase on mine, or you will have merge issues.
If eval is broken for you, can you send me the error?
Eval seems to be fine. The int8wo and bfloat16 results are just very close; I thought they were exactly the same before, but there is actually a slight difference.
Summary: att. Also rerun the benchmarks/eval for llama2/llama3 to get the most recent perf/acc data. Test Plan: torchao/_models/llama/generate.py, torchao/_models/llama/eval.py
5a4a915 to d5ebc0e
Right now these are slow; I think we can add them to benchmarks.sh later, when the perf is better.
Summary:
att
Also rerun the benchmarks/eval for llama2/llama3 to get the most recent perf/acc data.
Test Plan:
torchao/_models/llama/generate.py
torchao/_models/llama/eval.py
llama2:
llama3: