forked from NVIDIA/Megatron-LM
Pull requests: ROCm/Megatron-LM
fix: explicitly set weights_only to False during checkpoint loading to support PyTorch 2.6
#56 opened Feb 9, 2025 by mpashkovskii
feat: add Grok-1 transformer layer and training scripts
#55 opened Feb 7, 2025 by mpashkovskii • Draft
feat: add LoRA adapter layer and Mixtral LoRA training
#53 opened Jan 31, 2025 by mpashkovskii
Add FSDP arguments and example script to train model with FSDP-v2
#52 opened Jan 28, 2025 by ryang-amd
[Perf] Skip creating attention mask in llama dataloader
#40 opened Dec 13, 2024 by billishyahao