For our three baselines on different datasets (OSCAR, C4, The Pile), we would like to plot scaling laws and retrieve their coefficients. Specifically, we are looking to reproduce Figure 1 of Scaling Laws for Neural Language Models.
The TensorBoard data for the baseline runs can be retrieved on the Big Science space on HuggingFace: it's the tr3 runs with tensorboard in their name. The naming scheme (tr3b, tr3c, etc.) is explained here.
For C4, we have an XL, L, and M model (tr3, tr3c, tr3c) with short warm-up. For OSCAR and The Pile, we have an XL, L, M, and S model (tr3d, tr3g, tr3h, tr3i and tr3, tr3j, tr3k, tr3l). For OSCAR, we should also add the 13B run to see if the fits hold (that's tr1-13B).
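As a starting point for the fits, here is a minimal sketch of how the coefficients could be retrieved once the final losses are extracted from TensorBoard. It assumes the power-law form L(x) = (x_c / x)^alpha used in the paper's Figure 1 and fits it by ordinary least squares in log-log space. The function name `fit_power_law`, the model sizes, and the loss values are all hypothetical; the losses are synthetic, generated from a known power law only to check that the fit recovers it, and are not results from the tr3 runs.

```python
import math


def fit_power_law(xs, losses):
    """Fit loss = (x_c / x) ** alpha by least squares in log-log space.

    Taking logs gives log(loss) = alpha * log(x_c) - alpha * log(x),
    i.e. a straight line with slope -alpha and intercept alpha * log(x_c).
    Returns (alpha, x_c).
    """
    if len(xs) != len(losses) or len(xs) < 2:
        raise ValueError("need at least two (x, loss) pairs of equal length")

    log_x = [math.log(x) for x in xs]
    log_l = [math.log(l) for l in losses]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_l = sum(log_l) / n

    # Closed-form simple linear regression on the log-transformed data.
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ll - mean_l) for lx, ll in zip(log_x, log_l))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_x

    alpha = -slope
    x_c = math.exp(intercept / alpha)
    return alpha, x_c


if __name__ == "__main__":
    # Hypothetical parameter counts standing in for the S, M, L, XL models.
    sizes = [1.25e8, 3.5e8, 7.6e8, 1.3e9]
    # Synthetic losses from a known power law (NOT measured values).
    true_alpha, true_x_c = 0.076, 8.8e13
    synthetic_losses = [(true_x_c / n) ** true_alpha for n in sizes]

    alpha, x_c = fit_power_law(sizes, synthetic_losses)
    print(f"alpha = {alpha:.4f}, x_c = {x_c:.3e}")
```

The same function can be applied per dataset (OSCAR, C4, The Pile) with x set to parameters, compute, or tokens. To see whether the fits hold, the 13B run can be left out of the fit and its loss compared against the extrapolated line.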