add block and grid loop for index_sample kernel to deal with a large-shape tensor #37816
Conversation
Thanks for your contribution!
Sorry to inform you that 9861e2f's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
unsigned int index_i = blockDim.x * blockIdx.x + threadIdx.x;
unsigned int index_j = blockDim.y * blockIdx.y + threadIdx.y;
for (; index_j < batch_size; index_j += blockDim.y * gridDim.y) {
  index_i = blockDim.x * blockIdx.x + threadIdx.x;
Are you sure this isn't redundant? 😂 There's really no need to recompute it, is there?
Deleted, thanks~
unsigned int index_j = blockDim.y * blockIdx.y + threadIdx.y;

for (; index_j < batch_size; index_j += blockDim.y * gridDim.y) {
  index_i = blockDim.x * blockIdx.x + threadIdx.x;
Same as above: there's no need to recompute index_i.
Deleted, thanks~
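For context, here is a minimal sketch of the resulting two-level grid-stride loop; the kernel body and the parameter names (in_data, index_data, out_data, index_length, input_length, batch_size) are assumptions for illustration and may differ from the PR's actual kernel.

// Sketch of a 2D grid-stride loop for index_sample (illustrative only).
template <typename T, typename IndexT>
__global__ void IndexSampleSketch(const IndexT* index_data, const T* in_data,
                                  T* out_data, size_t index_length,
                                  size_t input_length, size_t batch_size) {
  unsigned int index_i = blockDim.x * blockIdx.x + threadIdx.x;
  unsigned int index_j = blockDim.y * blockIdx.y + threadIdx.y;
  // Outer grid-stride loop over rows (batch dimension).
  for (; index_j < batch_size; index_j += blockDim.y * gridDim.y) {
    // Inner grid-stride loop over columns; using a fresh loop variable means
    // index_i never needs to be recomputed per row.
    for (unsigned int i = index_i; i < index_length;
         i += blockDim.x * gridDim.x) {
      unsigned int sample_pos = index_j * index_length + i;
      IndexT src_col = index_data[sample_pos];
      out_data[sample_pos] = in_data[index_j * input_length + src_col];
    }
  }
}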
@@ -153,9 +166,16 @@ class IndexSampleGradKernel<platform::CUDADeviceContext, T>
auto block_height =
    platform::RoundToPowerOfTwo(index_length * batch_size) / block_width;
dim3 block_dim(block_width, block_height);
unsigned int threads = 512;
This is duplicated code; it could be extracted into a helper:
void CheckLaunchParamValid(const framework::ExecutionContext& ctx, dim3* block_dim, dim3* grid_dim) {
unsigned int threads = 512;
block_dim->x = block_dim->x < threads ? block_dim->x : threads;
block_dim->y = block_dim->y < threads ? block_dim->y : threads;
dim3 max_grid_dim =
ctx.template device_context<platform::CUDADeviceContext>()
.GetCUDAMaxGridDimSize();
grid_dim->x = grid_dim->x < max_grid_dim.x ? grid_dim->x : max_grid_dim.x;
grid_dim->y = grid_dim->y < max_grid_dim.y ? grid_dim->y : max_grid_dim.y;
}
Then call
CheckLaunchParamValid(ctx, &block_dim, &grid_dim);
instead of writing it out twice.
Defined a MIN function to check the block dim and a LimitGridDim function to check the grid dim. Thanks~
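For reference, a hedged sketch of what such helpers might look like, reconstructed from the suggestion above; the names MIN and LimitGridDim come from this reply, but their exact definitions in the PR may differ.

// Sketch only: reconstructed from the review discussion, not the exact PR code.
#define MIN(a, b) ((a) < (b) ? (a) : (b))

// Clamp the launch grid to the device's maximum grid dimensions so oversized
// inputs are handled by the in-kernel loop instead of an invalid launch.
inline void LimitGridDim(const framework::ExecutionContext& ctx,
                         dim3* grid_dim) {
  dim3 max_grid_dim =
      ctx.template device_context<platform::CUDADeviceContext>()
          .GetCUDAMaxGridDimSize();
  grid_dim->x = MIN(grid_dim->x, max_grid_dim.x);
  grid_dim->y = MIN(grid_dim->y, max_grid_dim.y);
}

The block dims would then be clamped in place, e.g. MIN(block_dim.x, threads), before launching the kernel.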
LGTM
PR types
Bug fixes
PR changes
OPs
Describe
When the length of the input tensor is larger than block_dim * grid_dim, the index_sample kernel would not process the exceeding part. So we add a block and grid loop in the kernel.
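To illustrate how the pieces fit together, here is a hypothetical host-side fragment (not the PR's exact code): IndexSampleSketch and LimitGridDim refer to the sketches above, and block_width, block_height, stream, and the data pointers are assumed to be set up by the surrounding kernel class.

// Hypothetical launch configuration: the grid is clamped to the device limit,
// and any elements beyond block_dim * grid_dim are covered by the
// grid-stride loops inside the kernel.
dim3 block_dim(block_width, block_height);
dim3 grid_dim((index_length + block_dim.x - 1) / block_dim.x,
              (batch_size + block_dim.y - 1) / block_dim.y);
LimitGridDim(ctx, &grid_dim);
IndexSampleSketch<T, IndexT><<<grid_dim, block_dim, 0, stream>>>(
    index_data, in_data, out_data, index_length, input_length, batch_size);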