Releases · SulRash/minLLMTrain
It Works a Lot Better
What's Changed
- Improved Dataset Packing Efficiency by @SulRash in #6 (see the packing sketch below)
- Fixed a number of small bugs
- Improved general script performance
- Expanded the examples further
Full Changelog: v0.1...v0.2
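
For context on the packing item above, here is a minimal sketch of greedy dataset packing: tokenized examples are concatenated (separated by an EOS token) and sliced into fixed-length blocks so batch slots aren't wasted on padding. This illustrates the general technique only; it is not minLLMTrain's actual implementation, and `pack_sequences`, `block_size`, and `eos_token_id` are illustrative names.

```python
# Sketch of greedy dataset packing, NOT the repo's implementation:
# concatenate tokenized docs with an EOS separator, then emit
# fixed-length blocks with zero padding inside each block.
from typing import Iterable, Iterator

def pack_sequences(
    token_streams: Iterable[list[int]],
    block_size: int,
    eos_token_id: int,
) -> Iterator[list[int]]:
    buffer: list[int] = []
    for tokens in token_streams:
        buffer.extend(tokens)
        buffer.append(eos_token_id)  # mark the document boundary
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]
    # Leftover tokens shorter than block_size are dropped here;
    # a real implementation might pad or carry them into the next epoch.

if __name__ == "__main__":
    # Pack three short "documents" into blocks of 8 tokens.
    docs = [[1, 2, 3], [4, 5, 6, 7, 8], [9, 10]]
    for block in pack_sequences(docs, block_size=8, eos_token_id=0):
        print(block)  # -> [1, 2, 3, 0, 4, 5, 6, 7]
```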
It Works Well
- Can pretrain an LLM from scratch given a config file (see the config sketch after this list).
- Can train with DeepSpeed.
- Can train with PyTorch Fully Sharded Data Parallel (FSDP); a minimal wrapping sketch follows below.
- Code supports training with Megatron-LM.
- Reliable checkpointing.
- Implements dataset packing for efficient training.
- Easy-to-understand codebase.
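
As a rough illustration of config-driven pretraining, here is a hedged sketch of what such a config might carry. Every field name (`model_name`, `backend`, `block_size`, and so on) is an assumption made for illustration, not minLLMTrain's actual schema.

```python
# Hypothetical config loader for config-driven pretraining;
# the schema below is illustrative, not the repo's actual one.
import json
from dataclasses import dataclass

@dataclass
class TrainConfig:
    model_name: str            # e.g. a Hugging Face model identifier
    dataset_path: str          # path to the pretraining corpus
    backend: str               # "deepspeed", "fsdp", or "megatron"
    block_size: int = 2048     # packed sequence length
    learning_rate: float = 3e-4
    train_steps: int = 100_000

def load_config(path: str) -> TrainConfig:
    with open(path) as f:
        return TrainConfig(**json.load(f))

# Usage: cfg = load_config("configs/my_run.json"), then hand cfg
# to whichever training backend it names.
```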
There's a lot more, but these are just the highlights.
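
For the FSDP item above, here is a minimal sketch of wrapping a model with PyTorch's `FullyShardedDataParallel`. The process-group setup assumes a `torchrun`-style launch that sets the usual rank environment variables; this is a generic illustration, not the repo's code.

```python
# Minimal FSDP wrapping sketch; model construction and launch
# details are placeholders, not minLLMTrain's code.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model_fsdp(model: torch.nn.Module) -> FSDP:
    # Assumes torchrun has already set RANK / WORLD_SIZE env vars.
    if not dist.is_initialized():
        dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    # Shard parameters, gradients, and optimizer state across ranks.
    return FSDP(model, device_id=torch.cuda.current_device())
```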