Error: "Only able to place X replicas, but Y replicas were requested" #381
Comments
Hi @spring1915, I ran the following:
client = mii.serve("mistralai/Mistral-7B-Instruct-v0.2", tensor_parallel=2, replica_num=2)
response = client.generate(inputs, max_new_tokens=128)
and I am getting the error below:
Is DeepSpeed-MII not suitable for a single-GPU environment (A40)?
Please help with this.
I ran
on AWS ml.g5.12xlarge with 4 GPUs on one instance. I got this error:
Only able to place 1 replicas, but 2 replicas were requested.
A similar error (Only able to place 1 replicas, but 4 replicas were requested.) also occurred when I used client.generate(inputs, max_new_tokens=128, replica_num=4).
I used AWS DJL DeepSpeed to run it, with this serving.properties file:
model.py is a customized file containing the code above and other simple scripts needed when using the DJL server.
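For anyone trying to reason about the message, here is a minimal sketch of the GPU arithmetic that appears to be behind it. It is not MII's actual placement code. It assumes each replica needs tensor_parallel GPUs on a single host and that only the GPUs visible to the serving process count. The function names placeable_replicas and check_request are hypothetical and exist only for this illustration.

```python
def placeable_replicas(gpus_per_host, tensor_parallel):
    """Count replicas that fit, ASSUMING one replica needs
    `tensor_parallel` GPUs and cannot span hosts (an assumption,
    not confirmed from the MII source)."""
    if tensor_parallel < 1:
        raise ValueError("tensor_parallel must be >= 1")
    # Integer division drops leftover GPUs that cannot form a full replica.
    return sum(gpus // tensor_parallel for gpus in gpus_per_host)


def check_request(gpus_per_host, tensor_parallel, replica_num):
    """Return None if the request fits, otherwise a message shaped
    like the one reported in this issue."""
    placed = placeable_replicas(gpus_per_host, tensor_parallel)
    if placed < replica_num:
        return (f"Only able to place {placed} replicas, "
                f"but {replica_num} replicas were requested")
    return None


if __name__ == "__main__":
    # Four visible GPUs, two per replica, two replicas: fits.
    print(check_request([4], tensor_parallel=2, replica_num=2))
    # Only two GPUs visible to the process: one replica fits.
    print(check_request([2], tensor_parallel=2, replica_num=2))
    # A single GPU with tensor_parallel=2: no replica fits.
    print(check_request([1], tensor_parallel=2, replica_num=2))
    # A single GPU with tensor_parallel=1 and one replica: fits.
    print(check_request([1], tensor_parallel=1, replica_num=1))
```

Under these assumptions, a single-GPU machine such as one A40 can only satisfy tensor_parallel=1 with replica_num=1. On a 4-GPU instance, a "1 replicas" result would be consistent with fewer GPUs being visible to the serving process than the instance has, though nothing in this thread confirms that as the cause.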