Commit
Update on "add selective activation checkpointing"
Unlike full activation checkpointing (AC), which always recomputes activations during the backward pass, selective activation checkpointing (SAC) stores some intermediate activations. This saves training time at the cost of higher memory usage.

Test results on Llama 7B:

with full activation checkpointing:
- [rank0]: Average iter time: 4.9126 seconds
- [rank0]: Peak Memory: Reserved 40.61%, Alloc 28.12%, Active: 29.61%

with selective activation checkpointing:
- [rank0]: Average iter time: 4.5459 seconds
- [rank0]: Peak Memory: Reserved 80.45%, Alloc 62.0%, Active: 63.43%

In this run, SAC is about 7.5% faster per iteration, while peak reserved memory roughly doubles.

[ghstack-poisoned]
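The time/memory trade-off described in this commit message can be illustrated with a small toy cost model. This is only a sketch under stated assumptions: the op names, relative costs, activation sizes, and the "save expensive ops" threshold are all hypothetical. They are not measured, and they are not claimed to be the policy this commit implements.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Op:
    """One op in a hypothetical transformer block (all numbers are made up)."""
    name: str
    flops: int      # relative cost to recompute this op's output
    act_bytes: int  # relative size of the activation it produces


# Hypothetical block: matmul-like ops are expensive, elementwise ops are cheap.
BLOCK = [
    Op("attn_norm", 1, 1),
    Op("qkv_proj", 6, 3),
    Op("attention", 4, 1),
    Op("out_proj", 2, 1),
    Op("mlp_norm", 1, 1),
    Op("mlp_up", 8, 4),
    Op("silu_mul", 1, 4),
    Op("mlp_down", 8, 1),
]


def estimate(ops: Sequence[Op], save: Callable[[Op], bool]) -> Tuple[int, int]:
    """Return (recompute_flops, stored_bytes) for one block under a save policy.

    Activations of ops where save(op) is True are kept for the backward pass;
    every other op is recomputed during backward.
    """
    stored = sum(op.act_bytes for op in ops if save(op))
    recompute = sum(op.flops for op in ops if not save(op))
    return recompute, stored


def full_ac(op: Op) -> bool:
    # Full AC: keep nothing inside the block, recompute everything.
    return False


def selective_ac(op: Op) -> bool:
    # Assumed SAC policy: keep outputs of ops that are expensive to recompute.
    return op.flops >= 4


full_recompute, full_stored = estimate(BLOCK, full_ac)
sac_recompute, sac_stored = estimate(BLOCK, selective_ac)

print(f"full AC: recompute={full_recompute}, stored={full_stored}")
print(f"SAC:     recompute={sac_recompute}, stored={sac_stored}")
```

In this toy model SAC does less recomputation and stores more activations than full AC, which is the same direction as the reported iteration times and peak memory. The magnitudes are not comparable to the measurements above.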