Code & data for the EMNLP 2024 paper: Is Child-Directed Speech Effective Training Data for Language Models?
machine-learning natural-language-processing deep-learning transformers language-modeling artificial-intelligence cognitive-science synthetic-data blimp curriculum-learning zorro language-acquisition developmental-psychology roberta gpt-2 gpt-4 child-directed-speech babylm emnlp2024 tinydialogues
Updated Jul 20, 2025 - Python