Update on "Add the option to turn on async-TP"
This PR adds the option to turn on async-TP (`--experimental.enable_async_tensor_parallel`). The feature is currently implemented as compiler passes on relevant patterns, so the option only takes effect when compile is enabled.

Some trace samples from llama3_70b with TP degree 8:

**all-gather -> qkv projection**

Baseline:
<img width="420" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/df6980c3-4a2f-4455-bdd3-9079b538123f">

With async-TP:
<img width="513" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/635c3dee-660d-4452-809b-32620343080a">

**ffn -> reduce-scatter**

Baseline:
<img width="537" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/6b045c84-48df-4798-a786-4f57e3f4345a">

With async-TP:
<img width="451" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/63f13859-97f7-48ea-aef6-4e8861b207ac">

**all-gather -> ffn**

Baseline:
<img width="494" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/b1636055-9b5b-43b1-b98e-b91f06af995e">

With async-TP:
<img width="536" alt="image" src="https://github.com/pytorch/torchtitan/assets/4156752/3edaedf4-3780-423d-ba86-5aa1cc5e69df">

[ghstack-poisoned]
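The dependency between the async-TP option and compile can be pictured as a small guard. The following is only an illustrative sketch, not the code in this PR: `JobOptions` and `async_tp_is_effective` are hypothetical names, and the two fields are assumed stand-ins for the compile setting and the `--experimental.enable_async_tensor_parallel` flag.

```python
import warnings
from dataclasses import dataclass


@dataclass
class JobOptions:
    # Hypothetical stand-ins for the real job config fields (assumption).
    compile: bool = False
    enable_async_tensor_parallel: bool = False


def async_tp_is_effective(opts: JobOptions) -> bool:
    """Return True only when the async-TP option can actually take effect.

    Async-TP is described as a set of compiler passes, so requesting it
    without compile enabled has no effect. This sketch warns in that case
    instead of failing.
    """
    if not opts.enable_async_tensor_parallel:
        return False
    if not opts.compile:
        warnings.warn(
            "async-TP was requested but compile is disabled; "
            "the option will have no effect."
        )
        return False
    return True
```

With this guard, `JobOptions(compile=True, enable_async_tensor_parallel=True)` is the only combination reported as effective; whether the real implementation warns, raises, or silently ignores the flag is not stated in this description.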