[PyTorch] Prototype for operation-based API #707

Merged: 74 commits, Jul 9, 2024
01c22d9
Add basic infrastructure for Sequential module
timmoon10 Feb 15, 2024
8657aad
Add linear op
timmoon10 Feb 16, 2024
2d7f1fe
Add FP8 support in linear op
timmoon10 Feb 27, 2024
169d58d
Add reshape op and unit test
timmoon10 Mar 2, 2024
0808755
Add bias op
timmoon10 Mar 5, 2024
92cdc7a
Add unfused linear op
timmoon10 Mar 5, 2024
e247265
Debug unfused linear op
timmoon10 Mar 5, 2024
0b389eb
Add test for linear+bias op
timmoon10 Mar 6, 2024
7e76476
Add separate abstract classes for unfused and fused ops
timmoon10 Mar 6, 2024
8e5d69c
Consolidate unfused ops in submodule
timmoon10 Mar 6, 2024
e3941dd
Add linear-bias fused op
timmoon10 Mar 6, 2024
21d8100
Use fused cast-transpose in linear ops
timmoon10 Mar 7, 2024
0750bb6
Disable GEMM+bias fusion with FP32 activations
timmoon10 Mar 7, 2024
9f4b5bf
Add parallel unit test for unfused linear op
timmoon10 Mar 8, 2024
96f6023
Refactor parallel tests to reduce job launches
timmoon10 Mar 8, 2024
bc7ca5f
Add all-reduce, all-gather, and reduce-scatter ops
timmoon10 Mar 9, 2024
05bd5c0
Remove unused file
timmoon10 Mar 9, 2024
b9b05f7
Merge branch 'main' into fuser-prototype
timmoon10 Mar 11, 2024
5a49f8e
Debug multi-GPU FP8 test
timmoon10 Mar 12, 2024
c2d9964
Add support for FP8 scale updates
timmoon10 Mar 16, 2024
2a9c90d
Add license boilerplate
timmoon10 Mar 18, 2024
6b3db28
Fuse GEMM+bias in row TP
timmoon10 Mar 19, 2024
57e7055
Rename pipeline to fuser
timmoon10 Mar 19, 2024
ee2bb6a
Tweak documentation
timmoon10 Mar 19, 2024
6f25426
Preserve cached FP8 transpose between ops
timmoon10 Mar 20, 2024
337619b
Add option for fused wgrad accumulation
timmoon10 Mar 20, 2024
5eeef51
Merge branch 'main' into fuser-prototype
timmoon10 Mar 21, 2024
1188d9e
Directly output FP8 from linear if needed
timmoon10 Mar 22, 2024
2c3fb0b
Merge branch 'main' into fuser-prototype-merge
timmoon10 Apr 19, 2024
71fd0bc
Fix cuDNN front-end commit
timmoon10 Apr 19, 2024
018f2d3
Use updated FP8 tensor API for transpose caching
timmoon10 Apr 22, 2024
3f66b77
Use updated API for FP8 scale updates
timmoon10 Apr 24, 2024
0189082
Merge branch 'main' into fuser-prototype
timmoon10 May 6, 2024
f3917dd
Add tests for non-default FP8 recipes
timmoon10 May 6, 2024
98df184
Rename UnfusedOperation to BasicOperation
timmoon10 May 6, 2024
2c3e787
Add unit test to check amax reduction with fusable op
timmoon10 May 7, 2024
6310774
Merge branch 'main' into fuser-prototype
timmoon10 May 7, 2024
3805dd3
Operator autograd state no longer needs to be initialized
timmoon10 May 8, 2024
4316ebb
Initial functional implementation of linear op
timmoon10 May 9, 2024
3896de0
Debug fused linear+bias op
timmoon10 May 9, 2024
b95590b
Remove autograd context from functional linear impl
timmoon10 May 10, 2024
7fe548c
Use functional linear impl in fused linear+bias op
timmoon10 May 10, 2024
70844e0
Merge branch 'main' into fuser-prototype
timmoon10 May 21, 2024
7803357
Rename subdirectory from "fuser" to "ops"
timmoon10 May 21, 2024
5477a8f
Merge branch 'main' into fuser-prototype
timmoon10 May 30, 2024
ca1829e
Update with Float8Tensor changes in #820
timmoon10 May 30, 2024
8e037b9
Remove unnecessary CPU overheads
timmoon10 May 30, 2024
f4f9689
Correctly pass FP8 metadata from next op
timmoon10 May 30, 2024
c8ee677
Merge branch 'main' into fuser-prototype
timmoon10 Jun 1, 2024
738df8a
Fix linter errors
timmoon10 Jun 3, 2024
0d0cfec
Add convenience functions to manipulate Sequential class
timmoon10 Jun 4, 2024
6d46b25
Merge branch 'main' into fuser-prototype
timmoon10 Jun 4, 2024
c092f43
Merge branch 'main' into fuser-prototype
timmoon10 Jun 7, 2024
197c307
Update name of PyTorch extensions module
timmoon10 Jun 7, 2024
63e97b3
Clear saved tensor data in linear op after bprop
timmoon10 Jun 8, 2024
9650924
Fix Pylint error
timmoon10 Jun 8, 2024
0f089e3
Merge branch 'main' into fuser-prototype
timmoon10 Jun 8, 2024
371052d
Update name of PyTorch extensions module
timmoon10 Jun 10, 2024
9d33a81
Merge branch 'main' into fuser-prototype
timmoon10 Jun 10, 2024
bc92e30
Fix test name in QA script
timmoon10 Jun 11, 2024
0783a1f
Merge branch 'main' into fuser-prototype
timmoon10 Jun 11, 2024
1aed75e
Merge branch 'main' into fuser-prototype
timmoon10 Jun 12, 2024
11b6226
Update name of PyTorch extensions module
timmoon10 Jun 12, 2024
56ec4a4
Run distributed tests even when only 1 GPU is available
timmoon10 Jun 13, 2024
ce3c93d
Merge branch 'main' into fuser-prototype
timmoon10 Jun 13, 2024
df30ec8
Merge branch 'main' into fuser-prototype
timmoon10 Jun 13, 2024
fa169b2
Only run distributed tests with 2 GPUs if there are >=2 GPUs
timmoon10 Jun 13, 2024
df98818
Merge branch 'main' into fuser-prototype
timmoon10 Jun 13, 2024
4faec05
Merge branch 'main' into fuser-prototype
timmoon10 Jun 15, 2024
f4e6af9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 15, 2024
6dd8712
Review suggestions from @sudhakarsingh27 and @ksivaman
timmoon10 Jun 27, 2024
9746498
Merge branch 'main' into fuser-prototype
timmoon10 Jun 27, 2024
c5b6ca8
Update transformer_engine/pytorch/ops/__init__.py
timmoon10 Jul 8, 2024
70cce3a
Merge branch 'main' into fuser-prototype
timmoon10 Jul 8, 2024
2 changes: 2 additions & 0 deletions qa/L0_pytorch_unittest/test.sh
@@ -22,3 +22,5 @@ pytest -v -s $TE_PATH/tests/pytorch/test_gqa.py
 pytest -v -s $TE_PATH/tests/pytorch/test_recipe.py
 pytest -v -s $TE_PATH/tests/pytorch/test_fused_optimizer.py
 pytest -v -s $TE_PATH/tests/pytorch/test_multi_tensor.py
+pytest -v -s $TE_PATH/tests/pytorch/test_fusible_ops.py
+pytest -v -s $TE_PATH/tests/pytorch/test_fusible_ops_distributed.py