Ds-inference Int8 support through ZeroQuant technology #2217
Commits on Aug 8, 2022
- cf2fe01: Fix the layer-past for GPT based models
  Reza Yazdani committed Aug 8, 2022
Commits on Aug 13, 2022
- c2cf304: add the Int8 support for ds-inference using ZeroQuant technology
  Reza Yazdani committed Aug 13, 2022
Commits on Aug 15, 2022
- d98f1f9: fixing some issue with loading checkpoint and bias-add
  Reza Yazdani committed Aug 15, 2022
- ebc82bb: adding the logic to store/restore scale for INT8 checkpoint
  Reza Yazdani committed Aug 15, 2022
- 43a7023: add empty quantization scale for different models to run with fp16
  Reza Yazdani committed Aug 15, 2022
- 00aa188
  Reza Yazdani committed Aug 15, 2022
- 9bed645
Commits on Aug 18, 2022
- 84e0d03: fix several issues after merging with master
  Reza Yazdani committed Aug 18, 2022
Commits on Aug 19, 2022
- f6cb028: several fixes for generating the INT8 sharded checkpoint
  Reza Yazdani committed Aug 19, 2022
- d47bea6
Commits on Aug 20, 2022
- cb72d9c: move quantizer declaration before inference branch
  Reza Yazdani committed Aug 20, 2022
Commits on Aug 24, 2022
- 32b9322
- 57779ef: fixing some part to catch up with latest update on HF side
  Reza Yazdani committed Aug 24, 2022
- f4e48e6: Merge branch 'ds-inference/ZeroQuant-Int8' of github.com:microsoft/DeepSpeed into ds-inference/ZeroQuant-Int8
  Reza Yazdani committed Aug 24, 2022
Commits on Aug 25, 2022
- dbcb6ec: reducing the CPU memory usage when loading checkpoint (this solves the issue when there is not enough CPU memory to load large models)
  Reza Yazdani committed Aug 25, 2022
- cd80ecc: some minor modification to the ckpt names
  Reza Yazdani committed Aug 25, 2022
Commits on Aug 26, 2022
- 82a37d6: remove masking and some configuration changes
  Reza Yazdani committed Aug 26, 2022
- 9d12656
  Reza Yazdani committed Aug 26, 2022
- 4ae356e
Commits on Aug 28, 2022
- d7ff364
- b17a3b5: fix some issue with int8 ckpt-loading
  Reza Yazdani committed Aug 28, 2022
Commits on Aug 29, 2022
- a541e52
Commits on Aug 30, 2022
- 2845bad
- c77f5e0
- f3f4b1d: change the mp_size to tp_size at inference config & add some doc-string at init_inference
  Reza Yazdani committed Aug 30, 2022