
Add device map config #331

Merged

mrwyattii merged 7 commits into main from mrwyattii/deploy-rank-refactor on Dec 15, 2023
Conversation

mrwyattii
Contributor

In MII-Legacy we had deploy_rank, which allowed us to specify which GPUs to deploy a model to. This did not compose well with multiple replicas, so I've refactored that code and brought it into the latest MII.

Here we add a device_map option to the config that allows users to specify which devices they want to deploy a model to for the persistent deployment (mii.serve). This works with multi-replica and multi-node cases. We can provide the following types for device_map:

  • int: device_map = 1 - deploy a single-GPU model to GPU1
  • List[int]: device_map = [2,3] - deploy a 2-GPU model to GPU2 and GPU3
  • List[List[int]]: device_map = [[0,2],[1,3]] - deploy 2 dual-GPU replicas, one to GPU0 & GPU2 and the other to GPU1 & GPU3
  • Dict[str,List[List[int]]]: device_map = {"host0":[[0,1],[2,3]], "host1":[[0,1],[2,3]]} - deploy 4 dual-GPU replicas across 2 nodes

The default value is "auto", which will automatically place models/replicas across devices and nodes. Users must still specify the proper replica_num and tensor_parallel values, and these values must match the device map provided. The device map is not required and is only needed when the default model/replica placement is not desired.
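For example, the List[List[int]] case above might look like the following (a minimal sketch; the model name is illustrative, and we assume device_map, tensor_parallel, and replica_num can be passed as keyword arguments to mii.serve):

```python
import mii

# Two dual-GPU replicas: one on GPU0 & GPU2, the other on GPU1 & GPU3.
# replica_num and tensor_parallel must agree with the shape of device_map:
# len(device_map) == replica_num and len(device_map[i]) == tensor_parallel.
client = mii.serve(
    "mistralai/Mistral-7B-v0.1",  # illustrative model name
    device_map=[[0, 2], [1, 3]],
    tensor_parallel=2,  # each replica spans 2 GPUs
    replica_num=2,      # two replicas total
)
```

For the multi-node Dict form, the same call would take device_map={"host0": [[0,1],[2,3]], "host1": [[0,1],[2,3]]} with replica_num=4.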

resolves #283

@mrwyattii mrwyattii marked this pull request as ready for review November 29, 2023 00:51
@mrwyattii mrwyattii merged commit 5eac7a9 into main Dec 15, 2023
@mrwyattii mrwyattii deleted the mrwyattii/deploy-rank-refactor branch December 15, 2023 20:46
Development

Successfully merging this pull request may close these issues.

How to select specific gpu index when using tensor parallel?