fix fp16 Qwen2 series model to DeepSpeed-FastGen #6028

ZonePG · 2024-08-17T02:06:18Z

Based on PR #5403 (Qwen1.5-MoE) and #5219 (Qwen1.5), this PR adds support for the Qwen2 series of models in DeepSpeed-FastGen.

Covered models: 0.5B, 1.5B, 7B, 57B-A14B, and 72B.
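
For anyone who wants to try one of the listed sizes, here is a rough sketch of how that might look through DeepSpeed-MII's non-persistent pipeline. It is not part of this PR's diff. The `Qwen/Qwen2-<size>` Hugging Face Hub naming and the helper names `qwen2_model_id` and `generate` are assumptions made for illustration. Actually running `generate` needs `deepspeed-mii` installed and a supported GPU.

```python
# Sizes taken from the PR description above.
QWEN2_SIZES = ("0.5B", "1.5B", "7B", "57B-A14B", "72B")


def qwen2_model_id(size: str) -> str:
    """Build a Hub model ID for a listed size.

    Assumption: checkpoints follow the "Qwen/Qwen2-<size>" naming scheme.
    """
    if size not in QWEN2_SIZES:
        raise ValueError(f"size {size!r} is not one of {QWEN2_SIZES}")
    return f"Qwen/Qwen2-{size}"


def generate(size: str, prompts: list, max_new_tokens: int = 64):
    """Hypothetical helper: run prompts through a FastGen pipeline."""
    # Imported lazily so the helper above stays usable without a GPU setup.
    import mii

    pipe = mii.pipeline(qwen2_model_id(size))
    return pipe(prompts, max_new_tokens=max_new_tokens)
```

A call such as `generate("7B", ["DeepSpeed is"])` would then load the 7B checkpoint and return the pipeline's responses. The smaller sizes are probably the quickest way to check a local setup.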

tohtana

I validated that this works on my end. Thank you for your great contribution!

fix Qwen2 serial model to DeepSpeed-FastGen

a9b82bc

ZonePG requested review from awan-10 and arashb as code owners August 17, 2024 02:06

Merge branch 'master' into master

4e0b6fc

loadams requested review from tohtana and removed request for arashb August 21, 2024 16:33

tohtana approved these changes Aug 21, 2024

tohtana added this pull request to the merge queue Aug 21, 2024

Merged via the queue into microsoft:master with commit e6fcc22 Aug 22, 2024
7 checks passed
