Pretrain openwebtext script with speed monitoring #147

carmocca · 2023-06-14T18:17:32Z

Part of #123

openwebtext with FSDP requires Lightning-AI/pytorch-lightning#17832

Follow-ups:

Add speed monitoring to fine-tuning and the other redpajama pre-training script
Write pretraining how-to
XLA support

carmocca · 2023-06-14T18:24:21Z

lit_parrot/speed_monitor.py

+    if not torch.cuda.is_available():
+        return 0


carmocca · 2023-06-14

We could make this function take the device type so that it also works for XLA devices. @gkroiz Do you know where we can get the FLOPs for the different TPU archs?

gkroiz · 2023-06-14

Hi, https://cloud.google.com/tpu/docs/system-architecture-tpu-vm seems to have some numbers for v2, v3, and v4, but only for bfloat16. Let me see if I can find a document that covers FLOPs for other precisions.

carmocca · 2023-06-15

Leaving this for a follow-up.
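
For readers following the thread, the device-type idea could look roughly like the sketch below. It is not the code from this PR or its follow-ups: the function name `available_flops`, the table `PLACEHOLDER_PEAK_FLOPS`, the chip names, and every number in the table are hypothetical placeholders. Real peak figures would have to come from vendor documentation such as the TPU page linked above. The sketch only illustrates keying the lookup on device type and falling back to 0 for unknown hardware, as the `return 0` in the diff does for non-CUDA machines.

```python
from typing import Mapping, Tuple

# Hypothetical lookup table: (device type, chip name) -> peak FLOP/s.
# The values are PLACEHOLDERS for illustration only, not vendor figures.
PLACEHOLDER_PEAK_FLOPS: Mapping[Tuple[str, str], float] = {
    ("cuda", "example-gpu"): 1.0e14,
    ("xla", "example-tpu"): 2.0e14,
}


def available_flops(
    device_type: str,
    chip_name: str,
    table: Mapping[Tuple[str, str], float] = PLACEHOLDER_PEAK_FLOPS,
) -> float:
    """Return the peak FLOP/s for a device, or 0 if the device is unknown.

    Returning 0 mirrors the early `return 0` in the diff: callers can treat
    0 as "utilization cannot be computed on this hardware".
    """
    key = (device_type.lower(), chip_name.lower())
    return table.get(key, 0)
```

A caller would pass the device type and chip name it detected at runtime and skip any utilization metric when the result is 0. Precision (for example bfloat16 versus float32) could be added as a third key component once figures for more than one precision are available.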

Pretrain openwebtext script

747f881

carmocca requested review from awaelchli and lantiga as code owners June 14, 2023 18:17

is_accumulating

0f2a3ad

carmocca commented Jun 14, 2023


carmocca added 7 commits June 14, 2023 20:28

Step CSV logger

a35d716

Merge branch 'main' into carmocca/pretrain-improvements

0a05a9a

Fix conversion scripts

48094cf

Update test

9975b8a

All working, missing Fabric FSDP patch

3ce6007

One more

da10672

Merge branch 'main' into carmocca/pretrain-improvements

b5a6491

carmocca merged commit 42ca14e into main Jun 15, 2023

carmocca deleted the carmocca/pretrain-improvements branch June 15, 2023 16:04

This was referenced Jun 15, 2023

Integrate the speed monitor in all training scripts #155

Merged

Support non-CUDA devices with SpeedMonitor #156

Merged

gkroiz mentioned this pull request Jun 21, 2023

[TPU] added numbers for TPU FLOPS #186

Merged

carmocca self-assigned this Nov 1, 2023

carmocca Jun 14, 2023

gkroiz Jun 14, 2023

carmocca Jun 15, 2023
