Update on "[BE][4/n] split pipeline_llama into a separate file"
- `parallelize_llama.py` has grown to almost 600 lines, which is too big to read comfortably. Let's split the PP part out, as it does not interact with the rest of the SPMD code.
- Moving `ParallelDims` into a standalone file, as it is used by both `parallelize_llama.py` and `pipeline_llama.py` (see the sketch below).
- Renaming `float8_linear.py` to `float8.py` and adding it to the `README.md`.
- Some other minor changes.

[ghstack-poisoned]
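As a rough illustration of the `ParallelDims` change, here is a minimal sketch of what a standalone `parallel_dims.py` might contain, assuming it is a plain dataclass holding per-dimension parallelism degrees. The field names, the `-1` "infer from world size" convention, and the validation shown here are illustrative assumptions, not the exact torchtitan API.

```python
# parallel_dims.py -- hypothetical sketch of a standalone ParallelDims module.
from dataclasses import dataclass


@dataclass
class ParallelDims:
    dp: int          # data-parallel degree (-1 means "infer from world_size")
    tp: int          # tensor-parallel degree
    pp: int          # pipeline-parallel degree
    world_size: int

    def __post_init__(self):
        for name in ("dp", "tp", "pp"):
            d = getattr(self, name)
            if d < 1 and d != -1:
                raise ValueError(f"{name} must be -1 or >= 1, got {d}")
        if self.dp == -1:
            # Infer the data-parallel degree from the remaining dimensions.
            self.dp = self.world_size // (self.tp * self.pp)
        if self.dp * self.tp * self.pp != self.world_size:
            raise ValueError(
                f"dp * tp * pp = {self.dp * self.tp * self.pp} "
                f"must equal world_size = {self.world_size}"
            )

    @property
    def pp_enabled(self) -> bool:
        return self.pp > 1


if __name__ == "__main__":
    # Example: on 8 ranks with tp=2 and pp=2, dp is inferred as 2.
    print(ParallelDims(dp=-1, tp=2, pp=2, world_size=8))
```

The point of the split is that both `parallelize_llama.py` (SPMD sharding) and the new `pipeline_llama.py` (PP splitting) can import such a module without depending on each other.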
tianyu-l committed Aug 4, 2024
1 parent 1353c0f commit 3c140b5
Showing 2 changed files with 3 additions and 2 deletions.
estimation.py (2 changes: 1 addition & 1 deletion)

```diff
@@ -16,7 +16,7 @@

 from torchtitan.config_manager import JobConfig
 from torchtitan.datasets import build_tokenizer
-from torchtitan.float8_linear import Float8Handler
+from torchtitan.float8 import Float8Handler
 from torchtitan.logging import init_logger, logger
 from torchtitan.models import model_name_to_cls, model_name_to_tokenizer, models_config
 from torchtitan.optimizer import build_lr_schedulers, build_optimizers
```
train.py (3 changes: 2 additions & 1 deletion)

```diff
@@ -10,8 +10,9 @@
 from datetime import timedelta

 import torch
-import torchtitan.utils as utils
 from torch.distributed.elastic.multiprocessing.errors import record
+
+from torchtitan import utils
 from torchtitan.checkpoint import CheckpointManager, TrainState
 from torchtitan.config_manager import JobConfig
 from torchtitan.datasets import build_hf_data_loader, build_tokenizer
```
