Vitaliy/sync foundry tests #32

vchiley · 2022-11-24T00:50:10Z

ports all other tests from llm-foundry

# pytest tests/*
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /workdisk/vitaliy/benchmarks/llm
collected 13 items                                                                                                                                                        

tests/compare_hf_v_mosaic_gpt.py ....                                                                                                                               [ 15%]
tests/dataloader_tests.py .                                                                                                                                         [ 23%]
tests/model_tests.py ...                                                                                                                                            [ 46%]
tests/test_file_loading_scripts.py ..                                                                                                                               [ 61%]
tests/tokenizer_tests.py .                                                                                                                                          [ 69%]
tests/training_integration_tests.py ..

===================================================================== 13 passed in 104.30s (0:01:44) ======================================================================

lmk if you think we should merge all of these or what we should do with them.

vchiley · 2022-11-28T23:41:54Z

rm unittests

benchmarks/llm# pytest tests/*
====================================================================== test session starts =======================================================================
platform linux -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /workdisk/vitaliy/benchmarks/llm
collected 13 items                                                                                                                                               

tests/c4_data_prep_script.py ....                                                                                                                          [ 15%]
tests/compare_hf_v_mosaic_gpt.py ..                                                                                                                        [ 30%]
tests/dataloader_tests.py .                                                                                                                                [ 38%]
tests/model_tests.py ...                                                                                                                                   [ 61%]
tests/tokenizer_tests.py .                                                                                                                                 [ 69%]
tests/training_integration_tests.py ..

================================================================= 13 passed in 104.06s (0:01:44) =================================================================

dblalock · 2022-11-29T03:50:21Z

Tried a fresh checkout of the branch in an interactive instance and the tests choked on my-copy-c4 not being present. Excited to merge the improved testing once this works (or xfails?) though.

root@interactive-a100-40gb-1-v15y-5bm5c:/workspace/vitaliy-benchmarks/llm# pytest tests/* --tb=short
...
tests/c4_data_prep_script.py ....
tests/compare_hf_v_mosaic_gpt.py ..
tests/dataloader_tests.py F
tests/model_tests.py FFF
tests/tokenizer_tests.py .
tests/training_integration_tests.py FF
...
----------------------------------------------------------- Captured stdout call ------------------------------------------------------------
Initializing model...
cfg.n_params=1.64e+06
Building train loader...
========================================================== short test summary info ==========================================================
FAILED tests/dataloader_tests.py::test_correct_padding - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
FAILED tests/model_tests.py::test_full_forward_and_backward - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
FAILED tests/model_tests.py::test_attention_mechanism - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
FAILED tests/model_tests.py::test_full_forward_and_backward_gpt_neo - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
FAILED tests/training_integration_tests.py::test_train[cpu] - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
FAILED tests/training_integration_tests.py::test_train[cuda] - FileNotFoundError: [Errno 2] No such file or directory: './my-copy-c4/val/index.json'
================================================== 6 failed, 7 passed in 61.96s (0:01:01) ===================================================

Also, as long as we're changing stuff, could you have all the files with tests start with test_? This way pytest tests/ works, not just pytest tests/*.
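
For reference, a minimal sketch of why `pytest tests/` skips most of these files today. It approximates pytest's default `python_files` patterns (`test_*.py` and `*_test.py`) with `fnmatchcase`. The `suggested_name` helper and its naming scheme are hypothetical, only one possible way to do the rename.

```python
from fnmatch import fnmatchcase

# pytest's default `python_files` patterns for test-module discovery.
DEFAULT_PATTERNS = ("test_*.py", "*_test.py")

# File names as they appear in the pytest output in this thread.
CURRENT_FILES = [
    "c4_data_prep_script.py",
    "compare_hf_v_mosaic_gpt.py",
    "dataloader_tests.py",
    "model_tests.py",
    "tokenizer_tests.py",
    "training_integration_tests.py",
]


def is_discovered(filename: str) -> bool:
    """Return True if the file name matches one of the default patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in DEFAULT_PATTERNS)


def suggested_name(filename: str) -> str:
    """Hypothetical rename: drop a trailing `_tests` and add a `test_` prefix."""
    if is_discovered(filename):
        return filename
    stem = filename[: -len(".py")]
    if stem.endswith("_tests"):
        stem = stem[: -len("_tests")]
    return f"test_{stem}.py"


if __name__ == "__main__":
    for name in CURRENT_FILES:
        print(f"{name}: discovered={is_discovered(name)} -> {suggested_name(name)}")
```

Note that `dataloader_tests.py` does not match `*_test.py` because of the trailing `s`, which is why only explicit paths such as `pytest tests/*` pick it up.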

vchiley · 2022-11-29T17:35:23Z

updated with xfail

(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm# pytest tests/*
============================================== test session starts ===============================================
platform linux -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /workdisk/vitaliy/benchmarks/llm
collected 13 items                                                                                                

tests/c4_data_prep_script.py ....                                                                           [ 15%]
tests/compare_hf_v_mosaic_gpt.py ..                                                                         [ 30%]
tests/dataloader_tests.py .                                                                                 [ 38%]
tests/model_tests.py ...                                                                                    [ 61%]
tests/tokenizer_tests.py .                                                                                  [ 69%]
tests/training_integration_tests.py ..

========================================= 13 passed in 102.14s (0:01:42) =========================================
(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm# ls
README.md  __pycache__  assets  convert_c4.py  main.py  my-copy-c4  requirements.txt  src  tests  venv  yamls
(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm# mv my-copy-c4 tmp_my-copy-c4
(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm# ls
README.md  __pycache__  assets  convert_c4.py  main.py  requirements.txt  src  tests  tmp_my-copy-c4  venv  yamls
(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm# pytest tests/*
============================================== test session starts ===============================================
platform linux -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /workdisk/vitaliy/benchmarks/llm
collected 13 items                                                                                                

tests/c4_data_prep_script.py ....                                                                           [ 15%]
tests/compare_hf_v_mosaic_gpt.py ..                                                                         [ 30%]
tests/dataloader_tests.py x                                                                                 [ 38%]
tests/model_tests.py ...                                                                                    [ 61%]
tests/tokenizer_tests.py .                                                                                  [ 69%]
tests/training_integration_tests.py xx

==================================== 10 passed, 3 xfailed in 148.99s (0:02:28) ===================================
(venv) root@interactive-a100-40gb-1-2loe-dwpsm:/workdisk/vitaliy/benchmarks/llm#
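
A minimal sketch of the xfail-when-dataset-is-missing pattern behind the run above, not the exact code in this PR. The path comes from the `FileNotFoundError` in the failing run; the names `dataset_missing`, `requires_c4`, and `test_val_index_is_readable` are hypothetical and exist only for illustration.

```python
from pathlib import Path

import pytest

# Dataset location taken from the FileNotFoundError messages in this thread.
DATA_ROOT = Path("./my-copy-c4")


def dataset_missing(root: Path = DATA_ROOT) -> bool:
    """Return True when the local C4 copy has no validation index file."""
    return not (root / "val" / "index.json").is_file()


# Hypothetical reusable marker: the test is expected to fail with
# FileNotFoundError only when the dataset has not been set up. When the
# dataset is present the condition is False and the test runs normally.
requires_c4 = pytest.mark.xfail(
    dataset_missing(),
    reason="my-copy-c4 is not set up in the working directory",
    raises=FileNotFoundError,
)


@requires_c4
def test_val_index_is_readable():
    # Stand-in for a real test that builds a dataloader over the dataset.
    text = (DATA_ROOT / "val" / "index.json").read_text()
    assert text
```

Passing `raises=FileNotFoundError` keeps the xfail narrow: any other exception is still reported as a regular failure rather than being hidden.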

* ResNet Benchmark (mosaicml#25)
  Updates the ResNet benchmark to no longer use yahp.
  Co-authored-by: Matthew <growlix@users.noreply.github.com>
  Co-authored-by: dblalock <dwb4ke@virginia.edu>
* correcting license and headers (mosaicml#29)
* Forgot to change branch in resnet benchmark... (mosaicml#30)
  Forgot to change branch...
* Update LLM benchmark with eval, HF models, bugfixes (mosaicml#26)
* Ade20k benchmark (mosaicml#27)
* Vitaliy/compare hf mosaic (mosaicml#28)
  * compare mosaic GPT vs HF GPT2
  * cleanup
  * updt abhi cmts
  Co-authored-by: Vitaliy Chiley <vitaliy@moasic.com>
* Vitaliy/sync foundry tests (mosaicml#32)
  * porting foundry tests
  * rm unittest
  * xfail test when dataset is not set up
  Co-authored-by: Vitaliy Chiley <vitaliy@moasic.com>
* CIFAR benchmark (mosaicml#31)
  * Add cifar benchmark
  Co-authored-by: dblalock <dwb4ke@virginia.edu>
* Add codeowners for each existing benchmark (mosaicml#36)
  add codeowners for each existing benchmark
* fix torch attn init (mosaicml#38)
  * add init for nn.MultiheadAttention
* Bert pre-training and fine-tuning on GLUE (mosaicml#24)
  Adds benchmark examples for BERT pre-training and fine-tuning on GLUE, and features support for HF models as well as a Mosaic BERT which is implemented and introduced here. See README.md for a detailed description of these components.
  TODO in a future PR:
  - README: Add final speedup results
  - YAMLs: Add configuration files for experiments shown in results
  - Tests
* Add precommit linting + get repo passing all checks (mosaicml#37)
  This adds the following pre-commit checks:
  - yapf
  - pyright
  - pycln
  - isort
  - inserting our license header into every python file
  - docformatter
  - pydocstyle
  - yamllint
  - yamlfmt
  This is mostly a copy-pasted .pre-commit-config.yaml, pyproject.toml, and .yamllint.yaml from the streaming repo, along with the associated code autoformatting.
  There was some manual intervention to fix license headers and occasional edge cases where the linters/formatters couldn't all agree (e.g., spacing before a function in another function without a docstring). I also just tightened up some of the docstrings when I was already making them satisfy the linters.
* Update flash attn version (mosaicml#40)
  * Update flash attn version
  * Update llm/requirements.txt
  Co-authored-by: Abhi Venigalla <77638579+abhi-mosaic@users.noreply.github.com>
  Co-authored-by: Abhi Venigalla <77638579+abhi-mosaic@users.noreply.github.com>
* Update URL to streaming docs
* printing state
* change batch number for fast forwarding
* fast forward batches properly
* Raise error if global batch size not divisible by world size (mosaicml#41)
  Adding in batchsize error, s.t. a user won't accidentally use an incorrect batchsize
* fixed save_overwrite
* hard peg version of mosaicml-streaming
* set mosaicml-streaming constraint to <0.2.0
* fixed part of bad merge
* mostly back to sophia's version
* mcli-streaming<0.2.0
* fixed lambada task
* attempt to bump streaming version
* attempt at using StreamingDataset
* used wrong StreamingDataset spec
* one more misaligned field
* passing shuffle seed to train dataloader
* didn't upload changes to data_c4
* updated config

Co-authored-by: Landan Seguin <landan@mosaicml.com>
Co-authored-by: Matthew <growlix@users.noreply.github.com>
Co-authored-by: dblalock <dwb4ke@virginia.edu>
Co-authored-by: Vitaliy Chiley <vitaliy@mosaicml.com>
Co-authored-by: Jeremy D <115047575+bmosaicml@users.noreply.github.com>
Co-authored-by: Austin <A-Jacobson@users.noreply.github.com>
Co-authored-by: Vitaliy Chiley <vitaliy@moasic.com>
Co-authored-by: dblalock <davis@mosaicml.com>
Co-authored-by: Alex Trott <alex@mosaicml.com>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: Abhi Venigalla <77638579+abhi-mosaic@users.noreply.github.com>
Co-authored-by: Bandish Shah <bandish@mosaicml.com>
Co-authored-by: bcui19 <bcui8377@gmail.com>
Co-authored-by: Sophia Wisdom <sophiawisdom1999@gmail.com>

Vitaliy Chiley and others added 9 commits November 22, 2022 03:21

66e0ac2 compare mosaic GPT vs HF GPT2
15b6eb2 updt
dea4c98 updt
f19923f cleanup
67e5bb9 merge main
5995ab6 updt after foundry merge
4898178 updt abhi cmts
208725b updt abhi cmts
15324a2 porting foundry tests

vchiley requested a review from abhi-mosaic November 24, 2022 00:50

vchiley self-assigned this Nov 24, 2022

vchiley requested a review from bmosaicml November 28, 2022 18:38

bmosaicml approved these changes Nov 28, 2022

Vitaliy Chiley added 2 commits November 28, 2022 23:34

584adaf rm unittest
8f1a38d Merge branch 'main' into vitaliy/sync_foundry_tests

vchiley marked this pull request as ready for review November 28, 2022 23:41

e97a0c5 xfail test when dataset is not set up

vchiley merged commit 01460d9 into main Nov 29, 2022

vchiley deleted the vitaliy/sync_foundry_tests branch November 29, 2022 17:37
