[cutlass] Sparse conv3d backward fusion #52361

Merged on Apr 13, 2023 (32 commits)

@umiswing commented Mar 30, 2023

PR types

Performance optimization

PR changes

Others

Describe

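1. Added gather_gemm_scatter fusion for computing d_kernel and d_x in sparse/gpu/conv_grad_kernel.cu.
2. Added code to the generation script that generates the kernels needed for sparse conv3d backward fusion.
3. Added the backward fusion interface to auto tune.
4. In the backward pass, the operator fusion provided by cutlass needs to write the GEMM results into a buffer. A buffer of size sizeof(float) * max_in_channels * max_out_channels * max_splitk_slices = 4 * 256 * 256 * 256 bytes = 67 MB is allocated once. If training needs a larger buffer, the buffer is updated. The buffer is released after training ends.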
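The buffer arithmetic in item 4 can be checked with a short sketch. The code below is a host-side illustration only, not the PR's implementation: `workspace_bytes` and `GrowOnlyWorkspace` are hypothetical names, and the "grow only when a larger request arrives" policy is an assumption based on the description above.

```python
FLOAT_BYTES = 4  # sizeof(float)


def workspace_bytes(in_channels, out_channels, splitk_slices):
    """Bytes needed to hold float GEMM results for every split-K slice."""
    return FLOAT_BYTES * in_channels * out_channels * splitk_slices


class GrowOnlyWorkspace:
    """Hypothetical stand-in for the device buffer; tracks capacity only."""

    def __init__(self, initial_bytes):
        self.capacity = initial_bytes
        self.reallocations = 0

    def request(self, nbytes):
        """Grow the buffer only if the request exceeds the current capacity."""
        if nbytes > self.capacity:
            self.capacity = nbytes
            self.reallocations += 1
        return self.capacity


# Default size from the PR description: 4 * 256 * 256 * 256 bytes.
default_bytes = workspace_bytes(256, 256, 256)
workspace = GrowOnlyWorkspace(default_bytes)

workspace.request(workspace_bytes(128, 64, 16))    # fits: no reallocation
workspace.request(workspace_bytes(512, 256, 256))  # too large: buffer grows
```

Here `default_bytes` is 67,108,864 bytes, which is about 67 MB in decimal units (exactly 64 MiB).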

Compared with the develop branch, adding backward fusion improves training performance by 5% on 4 A100 GPUs.

| | PaddlePaddle | PyTorch | PyTorch / PaddlePaddle |
|---|---|---|---|
| A100-40G | 21h -> 20h | 22h | 1.1 |
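For readers unfamiliar with the gather-GEMM-scatter pattern that this PR fuses, the following is a minimal unfused reference in pure Python for a single kernel offset. It is a sketch under assumptions, not the cutlass kernel: the function names and the paired `in_idx`/`out_idx` index layout are hypothetical, and the gradient formulas are the standard ones for a linear map (d_kernel = x^T · d_out, d_x = d_out · W^T).

```python
def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a list of rows."""
    return [list(col) for col in zip(*m)]


def backward_one_offset(x, d_out, weight, in_idx, out_idx, d_x):
    """Unfused gather -> GEMM -> scatter reference for one kernel offset.

    x:       [n_in, C_in] input features
    d_out:   [n_out, C_out] gradient of the output features
    weight:  [C_in, C_out] kernel slice for this offset
    in_idx, out_idx: paired row indices (hypothetical index layout)
    d_x:     [n_in, C_in] accumulator, updated in place
    Returns d_kernel, the [C_in, C_out] gradient for this offset.
    """
    x_g = [x[i] for i in in_idx]             # gather input rows
    d_out_g = [d_out[o] for o in out_idx]    # gather output-gradient rows
    d_kernel = matmul(transpose(x_g), d_out_g)   # GEMM: x_g^T @ d_out_g
    d_x_g = matmul(d_out_g, transpose(weight))   # GEMM: d_out_g @ W^T
    for row, i in zip(d_x_g, in_idx):        # scatter-add into input rows
        for c, v in enumerate(row):
            d_x[i][c] += v
    return d_kernel


# Tiny example: 3 input points (C_in = 2), 2 output points (C_out = 1).
x = [[1, 2], [3, 4], [5, 6]]
d_out = [[1], [2]]
weight = [[10], [20]]
d_x = [[0, 0] for _ in x]
d_kernel = backward_one_offset(x, d_out, weight, [0, 2], [1, 0], d_x)
```

A fused kernel would perform the gather and scatter inside the GEMM itself instead of materializing `x_g`, `d_out_g`, and `d_x_g` as separate intermediates, which is where a saving of the kind reported above would be expected to come from.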



paddle-bot bot commented Mar 30, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot bot commented Mar 30, 2023

❌ The PR is not created using PR's template. You can refer to this Demo.
Please use PR's template, it helps save our maintainers' time so that more developers get helped.

@umiswing changed the title from "Back fusion" to "[cutlass] Sparse conv3d backward fusion" on Mar 30, 2023

@JamesLim-sy left a comment

LGTM


@zyfncg left a comment

LGTM for static-check-ci

@zkh2016 merged commit 0b98d1a into PaddlePaddle:develop on Apr 13, 2023