
Add nn.functional.sparse_attention and some test cases, test=develop #35757

Merged

Conversation

Member

@Liu-xiandong Liu-xiandong commented Sep 15, 2021

PR types

New features

PR changes

APIs

Describe

Add paddle.nn.functional.sparse_attention API

  • This PR mainly wraps the sparse_attention functionality at the Python layer; the main body of the OP code is in #PR35676.

  • In addition, corresponding unit tests were added for the wrapped Python interface.
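The sparsity pattern is passed to the API in CSR form: a row-offset array plus a column-index array per batch and head. As a quick illustration, here is a small NumPy sketch that expands such a CSR pair into a dense boolean attention mask (the helper `csr_to_dense_mask` is ours for illustration, not part of the Paddle API):

```python
import numpy as np

def csr_to_dense_mask(offset, columns, seq_len):
    """Expand a CSR (offset, columns) pair into a dense boolean mask.

    offset[row] .. offset[row + 1] indexes the slice of `columns`
    holding the key positions that `row` is allowed to attend to.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for row in range(seq_len):
        cols = columns[offset[row]:offset[row + 1]]
        mask[row, cols] = True
    return mask

# The CSR data from the example below describes a block-diagonal pattern:
mask = csr_to_dense_mask(np.array([0, 2, 4, 6, 8]),
                         np.array([0, 1, 0, 1, 2, 3, 2, 3]), 4)
print(mask.astype(int))
# [[1 1 0 0]
#  [1 1 0 0]
#  [0 0 1 1]
#  [0 0 1 1]]
```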

Example

import paddle
import numpy as np

paddle.disable_static()

query_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
key_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
value_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
sparse_csr_offset_data = np.array([[[0, 2, 4, 6, 8]]]).astype("int32")
sparse_csr_columns_data = np.array([[[0, 1, 0, 1, 2, 3, 2, 3]]]).astype("int32")
print(query_data.shape)
# (1, 1, 4, 2)
print(sparse_csr_offset_data.shape)
# (1, 1, 5)
print(sparse_csr_columns_data.shape)
# (1, 1, 8)
query = paddle.to_tensor(query_data, stop_gradient=False, place=paddle.CUDAPlace(0))
key = paddle.to_tensor(key_data, stop_gradient=False, place=paddle.CUDAPlace(0))
value = paddle.to_tensor(value_data, stop_gradient=False, place=paddle.CUDAPlace(0))
offset = paddle.to_tensor(sparse_csr_offset_data, stop_gradient=False, place=paddle.CUDAPlace(0))
columns = paddle.to_tensor(sparse_csr_columns_data, stop_gradient=False, place=paddle.CUDAPlace(0))
output = paddle.nn.functional.sparse_attention(query, key, value, offset, columns)
print(output)

# [[[[1.60885942, 2.60885954],
#    [1.99830270, 2.99830270],
#    [1.60885942, 2.60885954],
#    [1.99830270, 2.99830270]]]]
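The output above can be cross-checked with a plain NumPy re-computation of attention restricted to the CSR pattern (a verification sketch only; `dense_reference` is a hypothetical helper, not part of this PR):

```python
import numpy as np

def dense_reference(q, k, v, offset, columns):
    # Recompute softmax(Q K^T / sqrt(d)) @ V row by row, keeping only
    # the columns listed in the CSR (offset, columns) pair.
    b, h, seq_len, d = q.shape
    out = np.zeros_like(q)
    for bi in range(b):
        for hi in range(h):
            off, cols = offset[bi, hi], columns[bi, hi]
            for row in range(seq_len):
                idx = cols[off[row]:off[row + 1]]
                scores = q[bi, hi, row] @ k[bi, hi, idx].T / np.sqrt(d)
                weights = np.exp(scores - scores.max())
                weights /= weights.sum()  # softmax over the kept columns only
                out[bi, hi, row] = weights @ v[bi, hi, idx]
    return out

qkv = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]], dtype="float32")
offset = np.array([[[0, 2, 4, 6, 8]]], dtype="int32")
columns = np.array([[[0, 1, 0, 1, 2, 3, 2, 3]]], dtype="int32")
print(dense_reference(qkv, qkv, qkv, offset, columns))
```

Running this reproduces the tensor printed by `sparse_attention` above to within floating-point tolerance.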

Result

Since the CI platform currently has no machines with CUDA 11.2, the locally computed results are pasted below:

  1. Local unit test results
    (screenshot)
  2. API example output
    (screenshot)

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@@ -1747,3 +1747,126 @@ class centers and the shape of sampled_class_center will be [num_positive_class_
'seed': seed if seed is not None else 0
})
return remapped_label, sampled_class_center


def sparse_attention(query,
Contributor

Suggestion: put this in a separate .py file; it does not feel very common.

paddle_result.numpy(), numpy_result, atol=1e-5))


class TestSparseAttentionAPITestFloat(TestSparseAttentionAPI):
Contributor

Shapes that are not a power of two can be supported, right?

Member Author

Yes, they are supported; a non-power-of-two shape test was added to the unit tests.

sparse_csr_columns,
name=None):
r"""
Sparse_attention refers to sparse the Attention matrix in Transformer
Contributor

The grammar does not seem quite right.

Member Author

Updated.

d represents the size of the last dimension of the three parameters.

Parameters:
query(Tensor): The query tensor in the Attention module.
Contributor

Check the grammar and see whether it can be improved.

Member Author

The documentation content was updated.

AnnaTrainingG
AnnaTrainingG previously approved these changes Oct 9, 2021
zkh2016
zkh2016 previously approved these changes Oct 9, 2021
not core.is_compiled_with_cuda() or get_suitable_env() == False,
"core is not compiled with CUDA and cuda version need >= 11.2 in windows")
not core.is_compiled_with_cuda() or get_cuda_version() < 11020,
"core is not compiled with CUDA and cuda version need larger than 11.2")
Contributor

Shouldn't this be "larger than or equal to"?

Member Author

Done

sparse_csr_columns,
name=None):
r"""
This operator implements the sparse_attention api. The api sparse the Attention matrix in Transformer module
Contributor

Line 30 could simply be changed to: This operator sparse the Attention matrix in Transformer module

query(Tensor): The query tensor in the Attention module.
It's a multidimensional tensor with a shape of
:math:`[batch\_size, num\_heads, target\_len, head\_dim]`.
The dtype can be ``float32`` and ``float64``.
Contributor

If the input is always 4-D, the description can state the dimensionality directly; suggestion:
query(Tensor): 4-D query Tensor of shape :math:`[batch\_size, num\_heads, target\_len, head\_dim]` in the Attention module.

Member Author

Done.

def test_static_result(self):
paddle.enable_static()
with paddle.static.program_guard(paddle.static.Program()):
Q = paddle.static.data(name="Q", shape=self.shape, dtype="float32")
Contributor

It should be dtype=self.dtype

Member Author

Done

Parameters:
query(Tensor): The query tensor in the Attention module.
It's a multidimensional tensor with a shape of
:math:`[batch\_size, num\_heads, target\_len, head\_dim]`.
Contributor

The meaning of target_len could be explained.

Member Author

Changed it to seq_len.

np.allclose(
fetches_result, expected_result, atol=1e-5))

def test_dygraph(self):
Contributor

The function names test_static_result and test_dygraph do not look like a matching pair; if necessary, the naming could be polished.

Member Author

Done. Renamed them.

@Liu-xiandong Liu-xiandong dismissed stale reviews from zkh2016 and AnnaTrainingG via b49797d October 9, 2021 09:31
@lanxianghit lanxianghit merged commit 85b7723 into PaddlePaddle:develop Oct 11, 2021
Liu-xiandong added a commit to Liu-xiandong/Paddle that referenced this pull request Oct 14, 2021
…addlePaddle#35757)

Add paddle.nn.functional.sparse_attention API

    This PR mainly wraps the sparse_attention functionality at the Python layer; the main body of the OP code is in #PR35676.

    In addition, corresponding unit tests were added for the wrapped Python interface.
Liu-xiandong added a commit to Liu-xiandong/Paddle that referenced this pull request Oct 19, 2021
…addlePaddle#35757)

Add paddle.nn.functional.sparse_attention API

    This PR mainly wraps the sparse_attention functionality at the Python layer; the main body of the OP code is in #PR35676.

    In addition, corresponding unit tests were added for the wrapped Python interface.
lanxianghit pushed a commit that referenced this pull request Oct 25, 2021
…35757) (#36551)

Add paddle.nn.functional.sparse_attention API

    This PR mainly wraps the sparse_attention functionality at the Python layer; the main body of the OP code is in #PR35676.

    In addition, corresponding unit tests were added for the wrapped Python interface.