index.qmd

---
site-url: https://saforem2.github.io/wordplay
website:
  open-graph: true
  description: "wordplay: _playing with words_."
  page-navigation: true
  title: "`wordplay` 🎮 💬"
  # title: "💬 wordplay 🤾"
  # title: "Sam Foreman"
  site-url: "https:saforem2.github.io/wordplay"
  favicon: "https://raw.githubusercontent.com/saforem2/wordplay/main/favicon.svg"
  back-to-top-navigation: true
  repo-url: https://github.com/saforem2/wordplay
  repo-actions: [source, edit, issue]
  google-analytics: G-XVM2Y822Y1
  # sidebar: false
  # twitter-card:
  #   image: "./assets/thumbnail.png"
  #   site: "@saforem2"
  #   creator: "@saforem2"
  twitter-card: true
  navbar:
    title: false
    tools:
      - icon: twitter
        href: https://twitter.com/saforem2
      - icon: github
        menu:
          - text: Source Code
            url: https://github.com/saforem2/wordplay/blob/master/index.qmd
          - text: New Issue
            url: https://github.com/saforem2/wordplay/issues/new/choose
# editor:
#   render-on-save: true
# execute:
#   freeze: true
# metadata-files:
#  - _website.yml
#  - _format.yml
#  # - _reveal.yml
format:
  html: default
  revealjs:
    scrollable: true
    output-file:  "slides.html"
    appearance:
      appearparents: true
    code-line-numbers: false
    code-link: false
    code-copy: false
    # callout-appearance: simple
    # syntax-definitions:
    #   - ./docs/python.xml
    title-block-style: none
    slide-number: c
    title-slide-style: default
    chalkboard:
      buttons: false
    auto-animate: true
    reference-location: section
    touch: true
    pause: false
    footnotes-hover: true
    citations-hover: true
    preview-links: true
    controls-tutorial: true
    controls: false
    logo: "https://raw.githubusercontent.com/saforem2/llm-lunch-talk/main/docs/assets/anl.svg"
    history: false
    highlight-style: "atom-one"
    css:
      - css/default.css
      - css/callouts.css
      - css/code-callout.css
      # - css/callouts-html.css
    theme:
      # - css/common.scss
      # - css/dark.scss
      # - css/syntax-dark.scss
      # - css/slides-dark.scss
      - css/common.scss
      - css/dark.scss
      - css/syntax-dark.scss
      - css/dark-reveal.scss
      # - white
      # - css/light.scss
      # - css/dark-reveal.scss
      # - css/syntax-light.scss
    self-contained: false
    embed-resources: false
    self-contained-math: false
    center: true
    default-image-extension: svg
    code-overflow: scroll
    html-math-method: katex
    fig-align: center
    # mermaid:
    #   theme: dark
    # revealjs-plugins:
    #   - RevealMenu
    # menu:
    #   markers: true
    #   themes:
    #     - name: Dark
    #       theme: css/dark.scss
    #       highlightTheme: css/syntax-dark.scss
    #     - name: Light
    #       theme: css/light.scss
    #       highlightTheme: css/syntax-light.scss
    # themesPath: './docs/css/'
  # gfm:
  #  author: Sam Foreman
  #  output-file: "index.md"
  gfm:
    author: Sam Foreman
    toc: true
    output-file: "README.md"
---

<!-- #  [`wordplay` 🎮 💬]{.title} -->

A set of simple, **scalable** and _highly configurable_ tools for working[^etc]
with LLMs.


> [!IMPORTANT]
> **Getting Started**
> 
> 1. Clone [`saforem2/wordplay`](https://github.com/saforem2/wordplay):
> 
>     ```bash
>     git clone https://github.com/saforem2/wordplay
>     ```
> 
> 2. Setup Python:
> 
>     ```bash
>     PBS_O_WORKDIR=$(pwd) source src/wordplay/bin/helpers.sh
>     setup_conda_polaris
>     ```
> 
> 3. Install deps:
> 
>     ```bash
>     mkdir -p deps/ezpz
>     git clone https://github.com/saforem2/ezpz deps/ezpz
>     python3 -m pip install -e deps/ezpz --require-virtualenv
>     python3  -m pip install -e .
>     ```
> 
> 4. Prepare data:
> 
>     ```bash
>     python3 data/shakespeare_char/prepare.py
>     ```
> 
> 5. Launch:
> 
>     ```bash
>     source deps/ezpz/src/ezpz/bin/savejobenv  # will define a `launch` alias
>     launch python3 -m wordplay +experiment=shakespeare data=shakespeare train.backend=DDP train.max_iters=1000 train.log_interval=5 train.compile=true
>     ```


<details closed><summary>Additional <code>utils</code></summary>:

- Login to node from running job:

    ```bash
    ssh $(qstat -u $USER -Efan | grep x3 | sed "s/\ \ \ //g" | sed 's/\/0\*64//g' | sed "s/+/\n/g" | tail -1)
    ```

- Create `venv`:

    ```bash
    VENV_DIR="venvs/$(echo $CONDA_PREFIX | tr '\/' '\t' | awk '{print $NF}')" && echo "Creating venv in: ${VENV_DIR}" && mkdir -p "${VENV_DIR}" && python3 -m venv "${VENV_DIR}" - -system-site-packages && source "${VENV_DIR}/bin/activate"
    ```

- Submit job:

    ```bash
    qsub -A argonne_tpc -q S1880309 -l select=2 -l walltime=01:00:00,filesystems=eagle:home -I
    ```

</details>


## Background {.scrollable}

What started as some simple
[modifications](https://github.com/saforem2/nanoGPT) to Andrej Karpathy\'s
`nanoGPT` has now grown into the `wordplay` project.

<!-- ::: {#fig-compare gap="5%" layout="[[40,40]]" layout-valign="bottom" style="text-align: center!important;" fig-align="center"} -->
<!-- ::: {layout-ncol=2 gap="5%" layout-valign="bottom"} -->

<!-- :::: {.columns layout-ncol=2 layout-valign="bottom" style="margin-bottom: 4em;" style="text-align:center"} -->

<!-- ::: {layout="[15,-10,15]" layout-valign="bottom"} -->

<!-- :::: {#fig-compare layout-ncol=2 layout-valign="bottom" style="display: flex; align-items: flex-end; text-align:center;"} -->

<!-- ::: {#fig-compare layout="[[40,-5,40]]" layout-valign="center" style="text-align: center;"} -->
<!---->
<!-- ![`nanoGPT`](https://github.com/saforem2/wordplay/blob/main/docs/assets/nanoGPT.png?raw=true){#fig-nanoGPT} -->
<!---->
<!-- ![`wordplay`](https://github.com/saforem2/wordplay/blob/main/docs/assets/wordplay.png?raw=true){#fig-wordplay} -->
<!---->
<!-- Generated using -->
<!-- [prodia/sdxl-stable-diffusion-xl](https://huggingface.co/spaces/prodia/sdxl-stable-diffusion-xl) -->
<!-- on 🤗 HuggingFace. -->
<!-- ::: -->

::: {#fig-compare layout="[[40,40]]" layout-valign="bottom" style="display: flex; align-items: flex-end;"}

![`nanoGPT`](https://github.com/saforem2/wordplay/blob/main/assets/car.png?raw=true){#fig-nanogpt width="256px"}

![`wordplay`](https://github.com/saforem2/wordplay/blob/main/assets/robot.png?raw=true){#fig-wordplay width="150px"}

`nanoGPT`, transformed.
::::

<details closed><summary>If you're curious...</summary>

While `nanoGPT` is a great project and an **excellent** resource; it is, _by
design_, very minimal[^nano] and limited in its flexibility.

Working through the code I found myself making minor changes here and there to
test new ideas and run variations on different experiments. These changes
eventually built to the point where _my_ `{goals, scope, code}` for the project
had diverged significantly from the original vision.

As a result, I figured it made more sense to move things to a new project,
[`wordplay`](https://github.com/saforem2/wordplay).

I've priortized adding functionality that I have found to be useful or
interesting, but am absolutely open to input or suggestions for improvement.

Different aspects of this project have been motivated by some of my recent work
on LLMs.

- Projects:
    - [`ezpz`](https://github.com/saforem2/ezpz): Painless distributed training
      with your favorite `{framework, backend}` combo.
    - [`Megatron-DeepSpeed`](https://github.com/argonne-lcf/Megatron-DeepSpeed):
      Ongoing research training transformer language models at scale, including:
      BERT & GPT-2

- Collaboration(s):
    - **DeepSpeed4Science** (2023-09)
        - [Loooooooong Sequence Lengths](https://samforeman.me/qmd/dsblog)
        - [Project Website](https://www.deepspeed4science.ai/) 
        - [Preprint](https://arxiv.org/abs/2310.04610) @song2023deepspeed4science
        - [Blog Post](https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/)
        - [Tutorial](https://www.deepspeed.ai/deepspeed4science/)
    - GenSLMs:
      - [GitHub](https://github.com/ramanathanlab/genslm)
      - [Preprint](https://www.biorxiv.org/content/10.1101/2022.10.10.511571v2)
      - 🏆 [ACM Gordon Bell Special Prize for COVID-19 Research](https://www.acm.org/media-center/2022/november/gordon-bell-special-prize-covid-research-2022)

- Talks / Workshops:
    - **LLM-lunch-talk** (2023-10-12): LLMs at [ALCF](https://alcf.anl.gov).
        - [Slides](https://saforem2.github.io/llm-lunch-talk/#/section)
        - [GitHub](https://github.com/saforem2/llm-lunch-talk)
    - **Creating Small(-ish) LLMs** (2023-11-30)
        - [Workshop](https://github.com/brettin/llm_tutorial/blob/main/tutorials/03-smallish-LLMs/README.md)
        - [Slides](https://saforem2.github.io/LLM-tutorial/#/creating-small-ish-llmsslides-gh)
        - [GitHub](https://github.com/saforem2/LLM-tutorial)

</details>

## Completed

- [x] Work with _any_ 🤗 HuggingFace [dataset](https://huggingface.co/docs/datasets/index)
- [x] Effortless distributed training using [`ezpz`](https://github.com/saforem2/ezpz)
- [x] Improved (type-safe) and extensible configuration system (powered by [`hydra`](https://hydra.cc)), see [#config](#config)
- [x] Automatic, detailed experiment + metric tracking with [Weights \& Biases](https://wandb.ai)
    - [Example Workspace](https://wandb.ai/l2hmc-qcd/WordPlay?workspace=user-saforem2)
    - [Example Run](https://wandb.ai/l2hmc-qcd/WordPlay/runs/in83cm3o/workspace?workspace=user-saforem2)
- [x] [Rich](https://github.com/Textualize/rich) informative logging with [`enrich`](https://github.com/saforem2/enrich)
- [x] [DeepSpeed](https://deepspeed.ai/) support \[~~completed~~: [2024-12-24](https://github.com/saforem2/wordplay/commit/1aec0ec46eb35ab5cf80a9166d7a5c00a862650a)\]


## In Progress

- [ ] [Full-Sharded Data-Parallel (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/) support
    - [Introducing PyTorch Fully Sharded Data Parallel (FSDP) API | PyTorch](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/)
- [ ] 3D Parallelism support via:
    - [Megatron-DeepSpeed](https://github.com/argonne-lcf/Megatron-DeepSpeed)
    - native PyTorch:
        - [Pipeline Parallelism — PyTorch 2.1 documentation](https://pytorch.org/docs/stable/pipeline.html)
        - [pytorch/PiPPy: Pipeline Parallelism for PyTorch](https://github.com/pytorch/PiPPy)


## Install

<details open><summary>Grab-n-Go</summary>

The easiest way to get the most recent version is to:

```bash
python3 -m pip install "git+https://github.com/saforem2/wordplay.git"
```

</details>

<details closed><summary>Development</summary>

If you'd like to work with the project and run / change things yourself, I'd
recommend installing from a local (editable) clone of this repository:

```bash
git clone "https://github.com/saforem2/wordplay"
cd wordplay
mkdir v venv
python3 -m venv venv --system-site-packages
source venv/bin/activate
python3 -m pip install -e .
```

</details>

<!-- # `wordplay` -->
<!---->
<!-- A minimal LLM implementation for research and education. -->


<!-- &title=visitors) -->

<!-- &edge_flat=false) -->

<!-- <p align="center"> -->
<!-- <a href="https://hits.seeyoufarm.com"> -->
<!--     <img align="center" src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay&count_bg=%2300CCFF&title_bg=%23303030&icon=&icon_color=%23E7E7E7&title=hits&edge_flat=false"/> -->
<!--   </a> -->
<!-- </p> -->

<!-- ## [{{< fa solid hourglass-end >}}]{.pink-text} Last Updated -->

---

```{python}
#| echo: false
import datetime
from rich import print
now = datetime.datetime.now()
day = now.strftime('%m/%d/%Y')
time = now.strftime('%H:%M:%S')
print(' '.join([
  "[dim italic]Last Updated[/]:",
  f"[#F06292]{day}[/]",
  f"[dim]@[/]",
  f"[#1A8FFF]{time}[/]"
]))
```

![](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fsaforem2.github.io%2Fwordplay&count_bg=%23222222&title_bg=%23303030&icon=&icon_color=%23E7E7E7)

[^nano]: `nano`, even 😂

[^etc]:|

    ```json
    {
      "training",
      "fine-tuning",
      "benchmarking",
      "parallelizing",
      "distributing",
      "measuring",
      "..."
    }
    ```

    large models at scale.