Fineweb vs Openwebtext: why use Fineweb? #787

shehper · 2024-11-21T20:10:49Z

Hey guys, I'm new to the llm.c community. Sorry if this is a naive question, but in comparison with nanoGPT, I noticed that:

1. llm.c uses FineWeb, while nanoGPT was trained on OpenWebText.
2. More importantly, nanoGPT was trained on 300B tokens to reach GPT-2-level performance, while llm.c reaches the same level with 10B tokens. Why does this discrepancy exist?
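
For context, here is a rough sketch of where the 300B figure comes from and how large the gap is. The hyperparameters are assumptions based on my reading of nanoGPT's `config/train_gpt2.py` (batch size 12, block size 1024, 5 x 8 gradient accumulation steps, 600k iterations), so please check them against the repo. The 10B number is just the llm.c token budget mentioned above.

```python
# Assumed values from nanoGPT's config/train_gpt2.py as I understand it;
# verify against the repo before relying on them.
batch_size = 12            # sequences per micro-batch
block_size = 1024          # tokens per sequence
grad_accum_steps = 5 * 8   # accumulation steps x number of GPUs
max_iters = 600_000        # total optimizer steps

# Tokens consumed per optimizer step and over the whole run.
tokens_per_iter = batch_size * block_size * grad_accum_steps
nanogpt_tokens = tokens_per_iter * max_iters

# llm.c token budget quoted in the question (10B FineWeb tokens).
llmc_tokens = 10e9

ratio = nanogpt_tokens / llmc_tokens
print(f"{tokens_per_iter:,} tokens per iteration")
print(f"{nanogpt_tokens / 1e9:.1f}B tokens for nanoGPT")
print(f"{ratio:.1f}x more tokens than llm.c")
```

Under these assumptions nanoGPT sees roughly 30 times as many tokens as llm.c, which is the discrepancy I'm asking about.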