Some question about cuda thread size. #7081

chengduoZH · 2017-12-27T12:43:18Z

In the ForRange struct, the thread (block) size appears to be assigned an arbitrary value that is not necessarily a multiple of the warp size.
From what I have read and heard, the number of threads assigned to a block should always be a multiple of the warp size (32); otherwise the remaining lanes of the last warp go unused, and performance also drops because of poor memory coalescing. However, I have not found a comparative experiment on this.

Paddle/paddle/platform/for_range.h

Lines 65 to 75 in 7bf47ea

    
    constexpr size_t num_threads = 1024;
    int block_size = limit_ <= num_threads ? limit_ : num_threads;
    int grid_size = (limit_ + num_threads - 1) / num_threads;
    if (grid_size == 1) {
      ForRangeElemwiseOpGridIsOne<<<1, block_size, 0, dev_ctx_.stream()>>>(
          func);
    } else {
      ForRangeElemwiseOp<<<grid_size, block_size, 0, dev_ctx_.stream()>>>(
          func, limit_);
    }
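
To make the question concrete, here is a minimal host-side sketch of how the block size could be rounded up to a multiple of the warp size. It is not the Paddle implementation: `LaunchConfig`, `RoundUpToWarp`, and `MakeLaunchConfig` are hypothetical names, the warp size of 32 is hard-coded as an assumption, and `limit` is assumed to be greater than zero. Whether this actually changes performance is exactly what remains unmeasured.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: warp size is 32; real code could read cudaDeviceProp::warpSize.
constexpr std::size_t kWarpSize = 32;
// Same per-block cap as num_threads in the snippet above.
constexpr std::size_t kMaxThreads = 1024;

// Hypothetical helper type, not part of Paddle.
struct LaunchConfig {
  std::size_t grid_size;
  std::size_t block_size;
};

// Round n up to the next multiple of the warp size.
inline std::size_t RoundUpToWarp(std::size_t n) {
  return (n + kWarpSize - 1) / kWarpSize * kWarpSize;
}

// Compute grid and block sizes for `limit` elements (limit > 0 assumed).
// Because block_size may now exceed limit, every kernel launched with this
// configuration must guard its work with `if (idx < limit)`. That includes
// the single-block kernel, which receives no limit argument in the snippet.
inline LaunchConfig MakeLaunchConfig(std::size_t limit) {
  assert(limit > 0);
  std::size_t block =
      limit <= kMaxThreads ? RoundUpToWarp(limit) : kMaxThreads;
  std::size_t grid = (limit + kMaxThreads - 1) / kMaxThreads;
  return {grid, block};
}
```

For example, `MakeLaunchConfig(100)` yields one block of 128 threads, of which the last 28 would have to exit early through the bounds check.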

typhoonzero · 2017-12-28T03:09:09Z

I'm working on #7045, where I would prefer to add the necessary functors for SelectedRows; after that, the ForRange call can be replaced.

chengduoZH · 2017-12-28T06:59:26Z

Currently, ForRange is used by adam_op, and #6601 also attempts to use it.

shanyi15 · 2018-08-15T11:10:33Z

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up on this question after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for the inconvenience caused by the closure and thank you for your support of PaddlePaddle!

chengduoZH assigned reyoung, qingqing01 and hedaoyuan Dec 27, 2017

chengduoZH mentioned this issue Feb 5, 2018

Refine for_range #8152

Closed

shanyi15 closed this as completed Aug 15, 2018
