AI researcher working on multilingual NLP, smol language modelling, huge language models, and educational AI! Previously helped build ALLaM, a state-of-the-art Arabic-English language model :)
- SmolTulu - Highest-performing sub-2B model on reasoning benchmarks - An investigation into learning rate & batch size ratios
- Fineweb-Edu-Ar - Largest open-source machine-translated Arabic educational dataset
- ALLaM - State-of-the-art Arabic-English LLM
- When Benchmarks are Targets - Analysis of LLM evaluation sensitivity (ACL 2024)
- huggingface-text-data-analyzer - Comprehensive tool for analyzing HF datasets
- EasyRogue - Command-line roguelike + RL testbed
- minLLMTrain - Minimal LLM training codebase
- Next-Token Agent - Training tiny LMs to play ASCII games
- Environment Encoder - VLM-based RL environment encoding (stale)
- I love dungeons and dragons!
- I've been trying to learn the Godot game engine.
- I enjoy making music on the piano in my spare time.
Let's connect! Find me on LinkedIn!