🚀 Feature
Custom op.
Kernel for masked_matrix_multiplication (both forward and backward; see the reference sketch after this list)
Kernel for sparse_softmax (both forward and backward)
Kernel for vector-shape spmm (both forward and backward)
PyTorch wrapper.
Multi-head support
CPU support
MXNet support
...
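The sketch below illustrates the semantics these kernels would need to match, written as plain PyTorch over a COO edge list (src, dst). The function names, argument shapes, and the COO layout are assumptions for illustration only, not the eventual DGL kernel API.

```python
import torch

def masked_mm_ref(src, dst, q, k):
    # Edge-wise attention scores: score[e] = <q[dst[e]], k[src[e]]>.
    # Only the (src, dst) pairs that exist as edges are computed,
    # instead of a dense N x N score matrix.
    return (q[dst] * k[src]).sum(dim=-1)                     # (E,)

def sparse_softmax_ref(dst, scores, num_nodes):
    # Softmax over the incoming edges of each destination node.
    # A global shift keeps exp() stable without changing the result.
    exp = (scores - scores.max()).exp()                      # (E,)
    denom = scores.new_zeros(num_nodes).index_add_(0, dst, exp)
    return exp / denom[dst]                                  # (E,)

def vector_spmm_ref(src, dst, alpha, v, num_nodes):
    # Weighted aggregation: out[i] = sum over edges e with dst[e] == i
    # of alpha[e] * v[src[e]].
    out = v.new_zeros(num_nodes, v.size(-1))
    return out.index_add_(0, dst, alpha.unsqueeze(-1) * v[src])
```

A fused forward/backward kernel can avoid materializing the gathered per-edge copies of q, k, and v that autograd would otherwise keep for the backward pass, which is roughly where the expected speed and memory savings would come from.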
Motivation
The current self-attention implementation in DGL is inefficient and uses too much GPU memory.
Custom op support is required to accelerate graph operations such as masked_mm and sparse_softmax, which are used in the self-attention module.
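For comparison, a dense formulation of masked attention, roughly the pattern the custom ops are meant to replace, materializes a full N x N score matrix no matter how sparse the graph is. This is a sketch for illustration only; the actual DGL implementation may differ.

```python
import torch

def dense_masked_attention(q, k, v, adj):
    # q, k, v: (N, d); adj: (N, N) boolean adjacency mask.
    # Assumes every node has at least one neighbor (e.g. a self-loop),
    # otherwise its softmax row is all -inf and produces NaN.
    scores = (q @ k.t()) / q.size(-1) ** 0.5        # N x N, even for sparse graphs
    scores = scores.masked_fill(~adj, float("-inf"))
    alpha = torch.softmax(scores, dim=-1)           # N x N attention weights
    return alpha @ v                                # (N, d)
```

With E edges, the edge-wise kernels sketched above reduce this footprint from O(N^2) to O(E).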
Alternatives
In the future there may be more elegant solutions, but for now we implement custom ops for these operations ourselves.
Additional context
You may find my primitive custom op implementations here (private repo). Note that I have not covered MXNet yet; I hope team members familiar with MXNet will help.