Pull requests: NVIDIA/TransformerEngine

Pull requests list

[PyTorch] Minor optimizations to reduce CPU overheads in modules (label: enhancement)
#1191 opened Sep 18, 2024 by timmoon10
6 of 13 tasks
[PyTorch] Fix detection of 3 in 3hd/h3d layouts
#1187 opened Sep 16, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Miscellaneous fixes for FA3 FP8 attention
#1174 opened Sep 10, 2024 by cyanguwa
9 of 13 tasks
[PyTorch] Fused dbias-cast-transpose in bias operation
#1168 opened Sep 6, 2024 by timmoon10
7 of 13 tasks
Fix autocast deprecation warning
#1167 opened Sep 6, 2024 by jondeaton
[PyTorch] Activation operations
#1164 opened Sep 6, 2024 by timmoon10
6 of 13 tasks
[PyTorch] Avoid saving fp8_tensors in certain scenarios
#1143 opened Aug 28, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Userbuffers support in operation-based API
#1142 opened Aug 27, 2024 by timmoon10
7 of 13 tasks
Norms Refractor
#1140 opened Aug 27, 2024 by phu0ngng Draft
5 of 13 tasks
Don't save fp8 q/k/v/out tensors when using bf16 bprop
#1139 opened Aug 27, 2024 by guyueh1
13 tasks
Fix param input order for cudagraph (label: bug)
#1138 opened Aug 27, 2024 by yifeis-nv
4 of 13 tasks
Add high_precision_init_val to model params when using fp8_model_init
#1121 opened Aug 19, 2024 by kunlunl
8 of 13 tasks
[PyTorch] Debug CUDA graph support with operation-based API (label: bug)
#1117 opened Aug 16, 2024 by timmoon10
7 of 13 tasks
[C/PyTorch] Userbuffers and comm+GEMM overlap algorithms refactored and moved to TE/common (label: enhancement)
#1067 opened Jul 31, 2024 by denera
8 of 13 tasks
[PyTorch] Debug checkpointing with operation-based API (label: bug)
#1063 opened Jul 31, 2024 by timmoon10
8 of 13 tasks
Use pyproject.toml to specify build requirements (label: build)
#1061 opened Jul 30, 2024 by ksivaman
6 of 13 tasks
[JAX] Support Ring Attention (Context Parallelism)
#1059 opened Jul 30, 2024 by mingxu1067 Draft
1 of 13 tasks
Change condition for ub tp overlap
#1055 opened Jul 29, 2024 by Victarry
1 of 13 tasks
[PyTorch] Normalization ops (label: enhancement)
#1033 opened Jul 22, 2024 by timmoon10
8 of 13 tasks
Flash attention: support softcap
#1013 opened Jul 14, 2024 by Lzhang-hub
7 tasks
[JAX] Sharding Utils
#1003 opened Jul 9, 2024 by mingxu1067 Draft
8 of 13 tasks