Parallelize Meg CUDA Kernel build system #174

Open
stas00 opened this issue Oct 29, 2021 · 0 comments
Labels
Good Difficult Issue For complex tasks Good First Issue Good for newcomers

Comments

stas00 commented Oct 29, 2021

Building the Meg CUDA kernels takes forever, about 5 minutes, because the build runs sequentially and doesn't take advantage of multiple cores. On top of that, the kernels get rebuilt every time one changes the number of GPUs, which is very unproductive and also makes the CI really slow.

We need to rewrite the build so that it runs in parallel.
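As a starting point for discussion, here is a minimal sketch of one possible approach: submit each kernel build to a thread pool instead of building one after another. This is not how the current build works. The names `build_kernels_in_parallel` and `build_fn` are hypothetical; `build_fn` stands in for whatever builds a single kernel (for example, a wrapper around one `torch.utils.cpp_extension.load` call), and the sketch assumes the individual builds are independent and safe to run concurrently.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def build_kernels_in_parallel(
    kernel_names: Iterable[str],
    build_fn: Callable[[str], object],  # hypothetical: builds ONE kernel by name
    max_workers: Optional[int] = None,
) -> Dict[str, object]:
    """Run build_fn for every kernel name concurrently and collect the results.

    Assumption: the builds do not depend on each other and do not write to
    the same output files, so they can overlap safely.
    """
    names = list(kernel_names)
    if not names:
        return {}

    # Default to one worker per kernel, capped by the number of CPU cores.
    workers = max_workers or min(len(names), os.cpu_count() or 1)

    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(build_fn, name): name for name in names}
        for future in as_completed(futures):
            # future.result() re-raises any exception from the build,
            # so a failed kernel build is not silently ignored.
            results[futures[future]] = future.result()
    return results
```

Threads are used here on the assumption that each build spends most of its time waiting on an external compiler process rather than running Python code. Separately, for ninja-based builds `torch.utils.cpp_extension` reads the `MAX_JOBS` environment variable to decide how many compile jobs to run, so that setting may be worth checking before rewriting anything.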

Side note: apex and deepspeed have the same problem, but deepspeed supports make -j.

Ideally the solution should come from pytorch. If we solve it generically, we could perhaps upstream the solution to pytorch core.

@stas00 stas00 added Good First Issue Good for newcomers Good Difficult Issue For complex tasks labels Oct 29, 2021
@stas00 stas00 mentioned this issue Nov 4, 2021
5 tasks