torch.roll in DeepSpeed #1693
Unanswered
sarvghotra asked this question in Q&A
Replies: 0 comments
Hi,

Does DeepSpeed have an optimized CUDA implementation of the `torch.roll` function? If not, could you please provide some pointers?

Context: I am trying to implement the cyclic shift in Shifted Window Multi-head Self-Attention (SW-MSA) from the Swin Transformer paper by reusing DeepSpeed's kernel code here.
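
To make the target operation concrete, here is a minimal pure-Python sketch of the index mapping a cyclic shift performs on a 2D grid. It is only a reference for the semantics, not DeepSpeed code: `roll_2d` is a hypothetical helper name, the 4x4 grid is made-up data, and the negative shift of one position is an assumption chosen for illustration. Each output element reads from exactly one source element, which is the form a one-thread-per-element kernel would presumably take.

```python
def roll_2d(grid, shift_h, shift_w):
    """Cyclically shift a 2D list of lists.

    Intended to mirror torch.roll(x, shifts=(shift_h, shift_w), dims=(0, 1))
    for a 2D input; roll_2d itself is a hypothetical helper.
    """
    h = len(grid)
    w = len(grid[0])
    out = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Gather form: output (i, j) reads from the source position
            # shifted back by (shift_h, shift_w), wrapping around the edges.
            out[i][j] = grid[(i - shift_h) % h][(j - shift_w) % w]
    return out


# Made-up 4x4 feature map with one value per position.
grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

# Shift up and left by one position (negative shifts), then undo it.
shifted = roll_2d(grid, -1, -1)
restored = roll_2d(shifted, 1, 1)
```

The reverse shift with the opposite sign restores the original layout, so the same mapping would cover both the shift before attention and the shift back after it.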