-
Notifications
You must be signed in to change notification settings - Fork 1
/
index.qmd
375 lines (312 loc) · 12.1 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
---
site-url: https://saforem2.github.io/wordplay
website:
open-graph: true
description: "wordplay: _playing with words_."
page-navigation: true
title: "`wordplay` 🎮 💬"
# title: "💬 wordplay 🤾"
# title: "Sam Foreman"
site-url: "https:saforem2.github.io/wordplay"
favicon: "https://raw.githubusercontent.com/saforem2/wordplay/main/favicon.svg"
back-to-top-navigation: true
repo-url: https://github.com/saforem2/wordplay
repo-actions: [source, edit, issue]
google-analytics: G-XVM2Y822Y1
# sidebar: false
# twitter-card:
# image: "./assets/thumbnail.png"
# site: "@saforem2"
# creator: "@saforem2"
twitter-card: true
navbar:
title: false
tools:
- icon: twitter
href: https://twitter.com/saforem2
- icon: github
menu:
- text: Source Code
url: https://github.com/saforem2/wordplay/blob/master/index.qmd
- text: New Issue
url: https://github.com/saforem2/wordplay/issues/new/choose
# editor:
# render-on-save: true
# execute:
# freeze: true
# metadata-files:
# - _website.yml
# - _format.yml
# # - _reveal.yml
format:
html: default
revealjs:
scrollable: true
output-file: "slides.html"
appearance:
appearparents: true
code-line-numbers: false
code-link: false
code-copy: false
# callout-appearance: simple
# syntax-definitions:
# - ./docs/python.xml
title-block-style: none
slide-number: c
title-slide-style: default
chalkboard:
buttons: false
auto-animate: true
reference-location: section
touch: true
pause: false
footnotes-hover: true
citations-hover: true
preview-links: true
controls-tutorial: true
controls: false
logo: "https://raw.githubusercontent.com/saforem2/llm-lunch-talk/main/docs/assets/anl.svg"
history: false
highlight-style: "atom-one"
css:
- css/default.css
- css/callouts.css
- css/code-callout.css
# - css/callouts-html.css
theme:
# - css/common.scss
# - css/dark.scss
# - css/syntax-dark.scss
# - css/slides-dark.scss
- css/common.scss
- css/dark.scss
- css/syntax-dark.scss
- css/dark-reveal.scss
# - white
# - css/light.scss
# - css/dark-reveal.scss
# - css/syntax-light.scss
self-contained: false
embed-resources: false
self-contained-math: false
center: true
default-image-extension: svg
code-overflow: scroll
html-math-method: katex
fig-align: center
# mermaid:
# theme: dark
# revealjs-plugins:
# - RevealMenu
# menu:
# markers: true
# themes:
# - name: Dark
# theme: css/dark.scss
# highlightTheme: css/syntax-dark.scss
# - name: Light
# theme: css/light.scss
# highlightTheme: css/syntax-light.scss
# themesPath: './docs/css/'
# gfm:
# author: Sam Foreman
# output-file: "index.md"
gfm:
author: Sam Foreman
toc: true
output-file: "README.md"
---
<!-- # [`wordplay` 🎮 💬]{.title} -->
A set of simple, **scalable** and _highly configurable_ tools for working[^etc]
with LLMs.
> [!IMPORTANT]
> **Getting Started**
>
> 1. Clone [`saforem2/wordplay`](https://github.com/saforem2/wordplay):
>
> ```bash
> git clone https://github.com/saforem2/wordplay
> ```
>
> 2. Setup Python:
>
> ```bash
> PBS_O_WORKDIR=$(pwd) source src/wordplay/bin/helpers.sh
> setup_conda_polaris
> ```
>
> 3. Install deps:
>
> ```bash
> mkdir -p deps/ezpz
> git clone https://github.com/saforem2/ezpz deps/ezpz
> python3 -m pip install -e deps/ezpz --require-virtualenv
> python3 -m pip install -e .
> ```
>
> 4. Prepare data:
>
> ```bash
> python3 data/shakespeare_char/prepare.py
> ```
>
> 5. Launch:
>
> ```bash
> source deps/ezpz/src/ezpz/bin/savejobenv # will define a `launch` alias
> launch python3 -m wordplay +experiment=shakespeare data=shakespeare train.backend=DDP train.max_iters=1000 train.log_interval=5 train.compile=true
> ```
<details closed><summary>Additional <code>utils</code></summary>:
- Login to node from running job:
```bash
ssh $(qstat -u $USER -Efan | grep x3 | sed "s/\ \ \ //g" | sed 's/\/0\*64//g' | sed "s/+/\n/g" | tail -1)
```
- Create `venv`:
```bash
VENV_DIR="venvs/$(echo $CONDA_PREFIX | tr '\/' '\t' | awk '{print $NF}')" && echo "Creating venv in: ${VENV_DIR}" && mkdir -p "${VENV_DIR}" && python3 -m venv "${VENV_DIR}" - -system-site-packages && source "${VENV_DIR}/bin/activate"
```
- Submit job:
```bash
qsub -A argonne_tpc -q S1880309 -l select=2 -l walltime=01:00:00,filesystems=eagle:home -I
```
</details>
## Background {.scrollable}
What started as some simple
[modifications](https://github.com/saforem2/nanoGPT) to Andrej Karpathy\'s
`nanoGPT` has now grown into the `wordplay` project.
<!-- ::: {#fig-compare gap="5%" layout="[[40,40]]" layout-valign="bottom" style="text-align: center!important;" fig-align="center"} -->
<!-- ::: {layout-ncol=2 gap="5%" layout-valign="bottom"} -->
<!-- :::: {.columns layout-ncol=2 layout-valign="bottom" style="margin-bottom: 4em;" style="text-align:center"} -->
<!-- ::: {layout="[15,-10,15]" layout-valign="bottom"} -->
<!-- :::: {#fig-compare layout-ncol=2 layout-valign="bottom" style="display: flex; align-items: flex-end; text-align:center;"} -->
<!-- ::: {#fig-compare layout="[[40,-5,40]]" layout-valign="center" style="text-align: center;"} -->
<!---->
<!-- ![`nanoGPT`](https://github.com/saforem2/wordplay/blob/main/docs/assets/nanoGPT.png?raw=true){#fig-nanoGPT} -->
<!---->
<!-- ![`wordplay`](https://github.com/saforem2/wordplay/blob/main/docs/assets/wordplay.png?raw=true){#fig-wordplay} -->
<!---->
<!-- Generated using -->
<!-- [prodia/sdxl-stable-diffusion-xl](https://huggingface.co/spaces/prodia/sdxl-stable-diffusion-xl) -->
<!-- on 🤗 HuggingFace. -->
<!-- ::: -->
::: {#fig-compare layout="[[40,40]]" layout-valign="bottom" style="display: flex; align-items: flex-end;"}
![`nanoGPT`](https://github.com/saforem2/wordplay/blob/main/assets/car.png?raw=true){#fig-nanogpt width="256px"}
![`wordplay`](https://github.com/saforem2/wordplay/blob/main/assets/robot.png?raw=true){#fig-wordplay width="150px"}
`nanoGPT`, transformed.
::::
<details closed><summary>If you're curious...</summary>
While `nanoGPT` is a great project and an **excellent** resource; it is, _by
design_, very minimal[^nano] and limited in its flexibility.
Working through the code I found myself making minor changes here and there to
test new ideas and run variations on different experiments. These changes
eventually built to the point where _my_ `{goals, scope, code}` for the project
had diverged significantly from the original vision.
As a result, I figured it made more sense to move things to a new project,
[`wordplay`](https://github.com/saforem2/wordplay).
I've priortized adding functionality that I have found to be useful or
interesting, but am absolutely open to input or suggestions for improvement.
Different aspects of this project have been motivated by some of my recent work
on LLMs.
- Projects:
- [`ezpz`](https://github.com/saforem2/ezpz): Painless distributed training
with your favorite `{framework, backend}` combo.
- [`Megatron-DeepSpeed`](https://github.com/argonne-lcf/Megatron-DeepSpeed):
Ongoing research training transformer language models at scale, including:
BERT & GPT-2
- Collaboration(s):
- **DeepSpeed4Science** (2023-09)
- [Loooooooong Sequence Lengths](https://samforeman.me/qmd/dsblog)
- [Project Website](https://www.deepspeed4science.ai/)
- [Preprint](https://arxiv.org/abs/2310.04610) @song2023deepspeed4science
- [Blog Post](https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/)
- [Tutorial](https://www.deepspeed.ai/deepspeed4science/)
- GenSLMs:
- [GitHub](https://github.com/ramanathanlab/genslm)
- [Preprint](https://www.biorxiv.org/content/10.1101/2022.10.10.511571v2)
- 🏆 [ACM Gordon Bell Special Prize for COVID-19 Research](https://www.acm.org/media-center/2022/november/gordon-bell-special-prize-covid-research-2022)
- Talks / Workshops:
- **LLM-lunch-talk** (2023-10-12): LLMs at [ALCF](https://alcf.anl.gov).
- [Slides](https://saforem2.github.io/llm-lunch-talk/#/section)
- [GitHub](https://github.com/saforem2/llm-lunch-talk)
- **Creating Small(-ish) LLMs** (2023-11-30)
- [Workshop](https://github.com/brettin/llm_tutorial/blob/main/tutorials/03-smallish-LLMs/README.md)
- [Slides](https://saforem2.github.io/LLM-tutorial/#/creating-small-ish-llmsslides-gh)
- [GitHub](https://github.com/saforem2/LLM-tutorial)
</details>
## Completed
- [x] Work with _any_ 🤗 HuggingFace [dataset](https://huggingface.co/docs/datasets/index)
- [x] Effortless distributed training using [`ezpz`](https://github.com/saforem2/ezpz)
- [x] Improved (type-safe) and extensible configuration system (powered by [`hydra`](https://hydra.cc)), see [#config](#config)
- [x] Automatic, detailed experiment + metric tracking with [Weights \& Biases](https://wandb.ai)
- [Example Workspace](https://wandb.ai/l2hmc-qcd/WordPlay?workspace=user-saforem2)
- [Example Run](https://wandb.ai/l2hmc-qcd/WordPlay/runs/in83cm3o/workspace?workspace=user-saforem2)
- [x] [Rich](https://github.com/Textualize/rich) informative logging with [`enrich`](https://github.com/saforem2/enrich)
- [x] [DeepSpeed](https://deepspeed.ai/) support \[~~completed~~: [2024-12-24](https://github.com/saforem2/wordplay/commit/1aec0ec46eb35ab5cf80a9166d7a5c00a862650a)\]
## In Progress
- [ ] [Full-Sharded Data-Parallel (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/) support
- [Introducing PyTorch Fully Sharded Data Parallel (FSDP) API | PyTorch](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)
- [ ] 3D Parallelism support via:
- [Megatron-DeepSpeed](https://github.com/argonne-lcf/Megatron-DeepSpeed)
- native PyTorch:
- [Pipeline Parallelism — PyTorch 2.1 documentation](https://pytorch.org/docs/stable/pipeline.html)
- [pytorch/PiPPy: Pipeline Parallelism for PyTorch](https://github.com/pytorch/PiPPy)
## Install
<details open><summary>Grab-n-Go</summary>
The easiest way to get the most recent version is to:
```bash
python3 -m pip install "git+https://github.com/saforem2/wordplay.git"
```
</details>
<details closed><summary>Development</summary>
If you'd like to work with the project and run / change things yourself, I'd
recommend installing from a local (editable) clone of this repository:
```bash
git clone "https://github.com/saforem2/wordplay"
cd wordplay
mkdir v venv
python3 -m venv venv --system-site-packages
source venv/bin/activate
python3 -m pip install -e .
```
</details>
<!-- # `wordplay` -->
<!---->
<!-- A minimal LLM implementation for research and education. -->
<!-- &title=visitors) -->
<!-- &edge_flat=false) -->
<!-- <p align="center"> -->
<!-- <a href="https://hits.seeyoufarm.com"> -->
<!-- <img align="center" src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay&count_bg=%2300CCFF&title_bg=%23303030&icon=&icon_color=%23E7E7E7&title=hits&edge_flat=false"/> -->
<!-- </a> -->
<!-- </p> -->
<!-- ## [{{< fa solid hourglass-end >}}]{.pink-text} Last Updated -->
---
```{python}
#| echo: false
import datetime
from rich import print
now = datetime.datetime.now()
day = now.strftime('%m/%d/%Y')
time = now.strftime('%H:%M:%S')
print(' '.join([
"[dim italic]Last Updated[/]:",
f"[#F06292]{day}[/]",
f"[dim]@[/]",
f"[#1A8FFF]{time}[/]"
]))
```
![](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay&count_bg=%23222222&title_bg=%23303030&icon=&icon_color=%23E7E7E7)
[^nano]: `nano`, even 😂
[^etc]:|
```json
{
"training",
"fine-tuning",
"benchmarking",
"parallelizing",
"distributing",
"measuring",
"..."
}
```
large models at scale.