Repeatability of Small Model Training Script with fixed seed(s) and same dataset #92

Open
pad9153 opened this issue Jun 6, 2024 · 1 comment
pad9153 commented Jun 6, 2024

We observed noticeable variability when re-running the FSDP model training script for a small 1.xB llama2 model with fixed seed(s) and the same tokens. Below is a snapshot of the evaluation results for three models trained with identical inputs (tokens, training script, seed(s)). Could you please help us investigate the root cause of this variability (data loader, hardware variability, or other additional variables)? Thanks in advance!

[Screenshot: evaluation results for the three runs]
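For context, "fixed seed(s)" above refers to the usual global RNG seeding. As a point of reference, here is a minimal sketch of that seeding plus the extra determinism flags that bit-exact GPU repeatability usually also needs; this is illustrative only, not the actual training script:

```python
import os
import random

import numpy as np
import torch


def set_global_seed(seed: int = 42) -> None:
    """Seed every RNG the training loop may touch (illustrative sketch)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


set_global_seed(42)

# Fixed seeds alone do not guarantee bit-exact repeatability on GPUs;
# non-deterministic kernels must also be disabled (at some throughput cost):
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # needed by deterministic CUDA matmuls
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.use_deterministic_algorithms(True, warn_only=True)
```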
@dangxuanhong

Yes, the above results were from 3 runs of the same yaml file (i.e., the same model config, dataset, training params, random seed, etc.), with only the experiment_id changed. The general setting is:

```yaml
tokenizer: /cos_ablation/tokenizers/bigcode_starcoder
max_seq_len: 8192
vocab_size: 49152
seed: 42
save_steps: 5000
max_steps: 35000
do_lmeval: True
learning_rate: 6e-4
max_batch_len: 2
num_nodes: 8
use_profiler: "False"
eos_token: "0"
bos_token: "None"
logical_shards: 640
```
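One suspect worth ruling out is the data-loading path (the logical_shards / distributed loader). For comparison, below is a minimal sketch of how per-worker seeding is usually pinned with a plain torch DataLoader; this is a hypothetical example, not this repo's sharded loader:

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def seed_worker(worker_id: int) -> None:
    # Derive each worker's seed from torch's initial seed so that a re-run
    # with the same base seed replays the exact same sample order.
    worker_seed = torch.initial_seed() % 2**32
    random.seed(worker_seed)
    np.random.seed(worker_seed)


# Toy dataset standing in for the tokenized training data.
dataset = TensorDataset(torch.arange(1_000))

g = torch.Generator()
g.manual_seed(42)  # same value as `seed:` in the yaml above

loader = DataLoader(
    dataset,
    batch_size=2,          # illustrative; not necessarily max_batch_len
    shuffle=True,
    num_workers=4,
    worker_init_fn=seed_worker,
    generator=g,
)
```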
