[Question] Deepspeed inference stage 3 + quantization #5398
Hello, I successfully ran DeepSpeed inference with ZeRO stage 3 acceleration on two GPUs, but I found that the two GPUs performed neither data parallelism nor model parallelism: each GPU was asked the same question. How should I solve this problem? Thank you for your reply. This is the code:

```python
from transformers.integrations import HfDeepSpeedConfig

if args.cpu_offload and args.nvme_offload_path:
    raise ValueError("Use one of --cpu_offload or --nvme_offload_path and not both")

if args.cpu_offload:
    dschf = HfDeepSpeedConfig(ds_config)  # this tells from_pretrained to instantiate directly on GPUs
```
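Each replica answering the same question is expected here: ZeRO-3 shards the model's parameters across GPUs, but every rank still runs the full forward pass on whatever input it is given, so identical inputs produce identical answers on both GPUs. To get data parallelism, each rank has to be fed a different subset of the prompts. A minimal sketch (`shard_prompts` is a hypothetical helper, not a DeepSpeed API):

```python
def shard_prompts(prompts, rank, world_size):
    """Return the slice of prompts this rank should process.

    ZeRO-3 only shards parameters; feeding each rank a disjoint
    slice of the inputs is what produces data parallelism.
    """
    return prompts[rank::world_size]

# In a real run, rank and world_size would come from torch.distributed,
# e.g. dist.get_rank() and dist.get_world_size().
prompts = ["q0", "q1", "q2", "q3", "q4"]
print(shard_prompts(prompts, 0, 2))  # rank 0 -> ['q0', 'q2', 'q4']
print(shard_prompts(prompts, 1, 2))  # rank 1 -> ['q1', 'q3']
```

With this kind of sharding, each GPU answers different questions while ZeRO-3 continues to shard the weights between them.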
When the model is quantized, the hidden sizes cannot be determined from `ds_shape` and `shape`, because they are one-dimensional. This PR fixes the bug by determining hidden sizes from `in_features` and `out_features`. This PR fixes #5398.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
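The idea behind the fix can be illustrated with a small sketch (the class and function names below are hypothetical, not DeepSpeed internals): an unquantized linear weight has a 2-D shape such as `(out_features, in_features)`, so the hidden size can be read straight from the shape, while a quantized weight is stored as a flat 1-D buffer and the size must instead come from the layer's recorded `in_features`/`out_features` metadata.

```python
class FakeParam:
    """Stand-in for a weight tensor with optional linear-layer metadata."""
    def __init__(self, shape, in_features=None, out_features=None):
        self.shape = shape
        self.in_features = in_features
        self.out_features = out_features

def infer_hidden_size(param):
    """Pick the hidden size from the shape when it is 2-D,
    otherwise fall back to the stored in_features metadata."""
    if len(param.shape) > 1:      # unquantized: shape is (out, in)
        return param.shape[-1]
    return param.in_features      # quantized: 1-D buffer, use metadata

print(infer_hidden_size(FakeParam((4096, 1024))))                   # 1024
print(infer_hidden_size(FakeParam((4194304,), in_features=1024)))   # 1024
```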
I am trying to set up DeepSpeed inference with ZeRO stage 3 as follows:
with `config.json`, the DeepSpeed config file, as follows:
and the error is as follows:
I tried setting `group_dim` to 2 in `config.json`, but this raised a "tuple index out of range" error.
My GPU doesn't support the BFloat16 format. I thought it was disabled by default, but I explicitly set the dtype to fp32 anyway (not that this changed anything). How can I fix this issue? The reason I would like to use ZeRO stage 3 is to go from a 2B model to a 7B model by offloading to CPU (and also to test this out a bit).
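For a GPU without BFloat16 support, the relevant settings can be sketched as a DeepSpeed config with both `bf16` and `fp16` explicitly disabled (so computation falls back to fp32) and ZeRO-3 parameter offload to CPU. The exact values below are assumptions to adapt, not a known-good config for any particular model:

```python
# Minimal ZeRO-3 inference config sketch: bf16/fp16 disabled for fp32
# compute, parameters offloaded to CPU. Field values are illustrative.
ds_config = {
    "bf16": {"enabled": False},
    "fp16": {"enabled": False},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
}
```

Explicitly disabling both half-precision sections avoids relying on defaults, which is useful when the error suggests a dtype the hardware cannot handle.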