Commit
Skip Triton import for AMD (microsoft#5110)
When testing DeepSpeed inference on an `AMD Instinct MI250X/MI250` GPU, the `pytorch-triton-rocm` module would break the `torch.cuda` device API. To address this, importing `triton` is skipped when the GPU is determined to be `AMD`.

This change allows DeepSpeed to be executed on an AMD GPU without kernel injection in the DeepSpeedExamples [text-generation example](https://github.com/microsoft/DeepSpeedExamples/tree/master/inference/huggingface/text-generation) using the following command:

```bash
deepspeed --num_gpus 1 inference-test.py --model facebook/opt-125m
```

TODO: Root-cause the interaction between `pytorch-triton-rocm` and DeepSpeed to understand why this is causing the `torch.cuda` device API to break.
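The guard described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `is_amd_device` and `import_triton_unless_amd` are hypothetical, and matching on the substring `AMD` in the device name is an assumption about how the GPU vendor might be detected.

```python
import importlib
from types import ModuleType
from typing import Optional, Tuple


def is_amd_device(device_name: str) -> bool:
    """Hypothetical check: treat any device whose name mentions AMD as an AMD GPU."""
    return "AMD" in device_name.upper()


def import_triton_unless_amd(device_name: str) -> Tuple[Optional[ModuleType], bool]:
    """Return (module, has_triton) without ever importing triton on AMD devices."""
    if is_amd_device(device_name):
        # Skip the import entirely so pytorch-triton-rocm is never loaded.
        return None, False
    try:
        return importlib.import_module("triton"), True
    except ImportError:
        # triton is optional; fall back gracefully when it is not installed.
        return None, False
```

One way to obtain the device name would be `torch.cuda.get_device_name(0)`, passing the result to `import_triton_unless_amd` and storing the returned flag for later feature checks.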