Subclass API #966
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/966
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 1 New Failure, 2 Unrelated Failures
As of commit 8d53959 with merge base 60ffb86:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D62464487
Summary:
Pull Request resolved: pytorch#966
Adds new int8_dynamic_activation_intx_weight quantization with subclass API
Differential Revision: D62464487
8b7c8fb to 8d53959
Summary: Adds new int8_dynamic_activation_intx_weight quantization with subclass API
Differential Revision: D62464487
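For readers unfamiliar with the scheme the summary names, here is a minimal pure-Python sketch of the general idea as the name suggests it: activations are quantized to int8 at call time (dynamically), and weights are quantized ahead of time to a low bit width. This is not the code or API added by this PR. The helper names (`quantize_symmetric`, `quantize_weight_groupwise`, `quantized_dot`), the symmetric rounding, and the per-group weight scales are all assumptions made for illustration.

```python
from typing import List, Tuple

QuantGroup = Tuple[List[int], float]  # (integer values, scale)


def quantize_symmetric(values: List[float], nbit: int) -> QuantGroup:
    """Hypothetical helper: symmetric quantization to signed nbit integers."""
    qmax = 2 ** (nbit - 1) - 1       # 127 for 8 bits, 7 for 4 bits
    qmin = -(2 ** (nbit - 1))
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return q, scale


def quantize_weight_groupwise(row: List[float], nbit: int,
                              group_size: int) -> List[QuantGroup]:
    """Quantize one weight row ahead of time, one scale per group (assumed layout)."""
    return [quantize_symmetric(row[i:i + group_size], nbit)
            for i in range(0, len(row), group_size)]


def quantized_dot(activation: List[float], weight_groups: List[QuantGroup],
                  group_size: int) -> float:
    """Dot product with int8 activations quantized dynamically, at call time."""
    q_act, act_scale = quantize_symmetric(activation, 8)
    total = 0.0
    for g, (q_w, w_scale) in enumerate(weight_groups):
        start = g * group_size
        # Integer accumulation within the group, then rescale to float.
        acc = sum(a * w for a, w in zip(q_act[start:start + group_size], q_w))
        total += acc * act_scale * w_scale
    return total


weights = quantize_weight_groupwise([0.5, 0.25, -1.0, 0.75], nbit=4, group_size=2)
approx = quantized_dot([1.0, -2.0, 0.5, 3.0], weights, group_size=2)
exact = 1.0 * 0.5 + -2.0 * 0.25 + 0.5 * -1.0 + 3.0 * 0.75  # 1.75
```

In this sketch `approx` is close to `exact` but not equal to it, since 4-bit weights lose precision. Smaller groups or more bits reduce that error at the cost of storage.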
Differential Revision: D62464487
Pull Request resolved: #995
* add llama 3.1 8b support
* make Model and ModelArgs as model definition entrance
* make model definition support multiple transformer
* make input arg static in Model to support export
* fix bugs for gguf and et in new model definition architecture
* retrieve text transformer arg from modelargs
* add set_cache function to Model to work around PTEModel issue
* make torchchat rely on torchtune
* remove export_util
* extra torchtune dependency