[C/PyTorch] Userbuffers and comm+GEMM overlap algorithms refactored and moved to TE/common #1067
base: main
Commits on Sep 6, 2024
-
moved userbuffers code to TE/common
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit e911bac
-
moved comm+GEMM overlap code to TE/common
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 4842566
-
removed PyTorch dependency from comm+GEMM overlap in TE/common
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit c587e76
-
added TE/PyTorch wrappers for refactored comm+GEMM overlap code in TE/common
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 4cc258b
-
updated TE/PyTorch Python API to match the refactored comm+GEMM overlap code
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit b9370a0
-
updated unit tests to work with refactored comm+GEMM overlap code
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit b03cf2d
-
added a pylint exception to comm+GEMM overlap test runner
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 9994989
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit 8c54738
-
Commit 82a18c0
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit 29fe3bd
-
added documentation for te.initialize_ub
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 64ffbbf
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit d840201
-
fixed compile errors when building with NVTE_UB_WITH_MPI=1
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 69ee948
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit f787c4b
-
fixed default bootstrap backend
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 3237517
-
switched default bootstrap backend priority to MPI > Gloo > NCCL
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 2e6da4d
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit aaca26e
-
updated bootstrap backend documentation
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit a04d85a
-
close UB bootstrap socket to avoid interfering with CUDA Multicast shareable file handle send/recv
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit d6f1225
-
added torch::Tensor wrappers for communication buffer and atomic counters so PyTorch can factor externally allocated memory into its garbage collection threshold
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 271cbf7
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit 4586653
-
automated handling of world, local and node ranks/sizes within C++ CommOverlapHelper to simplify Python function signatures
Signed-off-by: Alp Dener <adener@nvidia.com>
Commit 23f7dca
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Commit 620c1f9
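
One of the commits above switches the default bootstrap backend priority to MPI > Gloo > NCCL. The snippet below is only a minimal sketch of what such a priority-based selection could look like; it is not taken from the PR. The helper name `pick_bootstrap_backend`, the constant `BOOTSTRAP_PRIORITY`, and the availability mapping are hypothetical and chosen purely for illustration.

```python
from typing import Mapping, Sequence

# Priority order as stated in the commit message: MPI > Gloo > NCCL.
BOOTSTRAP_PRIORITY: Sequence[str] = ("mpi", "gloo", "nccl")


def pick_bootstrap_backend(available: Mapping[str, bool]) -> str:
    """Return the highest-priority backend flagged as available.

    Hypothetical helper: `available` maps a backend name to whether it
    can be used in the current environment.
    """
    for name in BOOTSTRAP_PRIORITY:
        if available.get(name, False):
            return name
    raise RuntimeError("no bootstrap backend available")


if __name__ == "__main__":
    # MPI missing, so the next entry in the priority list is chosen.
    print(pick_bootstrap_backend({"mpi": False, "gloo": True, "nccl": True}))
```

In a PyTorch environment the availability flags could plausibly be filled from `torch.distributed.is_mpi_available()`, `torch.distributed.is_gloo_available()` and `torch.distributed.is_nccl_available()`; whether the PR does so is an assumption, not something the commit list confirms.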