
CUDATreeLearner: free GPU memory in destructor if any allocated #4963

Merged · 6 commits merged into microsoft:master on Feb 20, 2022

Conversation

@denmoroz (Contributor) commented Jan 20, 2022

Fixes the memory freeing issue described here: allocated GPU memory is not released after a training cycle completes, which leads to constant memory growth when train() is invoked multiple times inside the same process (e.g. during a hyperparameter search).
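
For context, a minimal sketch of the pattern the fix follows: release per-device GPU resources in the destructor, but only if they were actually allocated. The class and member names (GpuResourceHolder, device_buffers_, streams_) are illustrative placeholders, not the actual CUDATreeLearner fields.

#include <cuda_runtime.h>
#include <vector>

class GpuResourceHolder {
 public:
  ~GpuResourceHolder() {
    // Free each device's resources only if they were allocated, so repeated
    // construction/destruction inside one process does not leak GPU memory.
    for (int device_id = 0; device_id < num_gpu_; ++device_id) {
      cudaSetDevice(device_id);
      if (device_buffers_[device_id] != nullptr) {
        cudaFree(device_buffers_[device_id]);
        device_buffers_[device_id] = nullptr;
      }
      if (streams_[device_id] != nullptr) {
        cudaStreamDestroy(streams_[device_id]);
        streams_[device_id] = nullptr;
      }
    }
  }

 private:
  int num_gpu_ = 0;
  std::vector<void*> device_buffers_;   // one raw device buffer per GPU
  std::vector<cudaStream_t> streams_;   // one CUDA stream per GPU
};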

ghost commented Jan 20, 2022

CLA assistant check: all CLA requirements met.

@denmoroz (Contributor, Author) commented

Oops, it seems some tests are not passing because they are cancelled by a timeout. Trying to restart.

@denmoroz (Contributor, Author) commented Jan 21, 2022

Still getting the same failure: [screenshot of the failing CI check] 🤔

@jmoralez (Collaborator) commented

Thank you for your contribution!

Oops, it seems some tests are not passing because they are cancelled by a timeout.

That is due to #4948 and will be solved by #4953. Once that is merged we'll let you know so you can update this branch, and everything should succeed.

@denmoroz (Contributor, Author) commented Jan 21, 2022

@jmoralez Oh, I see.

Once that is merged we'll let you know so you can update this branch, and everything should succeed.

Thanks 🙏

@jmoralez (Collaborator) commented

Hi @denmoroz. #4953 has been merged; can you please update this branch to include those changes?

@@ -63,6 +63,43 @@ CUDATreeLearner::CUDATreeLearner(const Config* config)
}

CUDATreeLearner::~CUDATreeLearner() {
#pragma omp parallel for schedule(static, num_gpu_)
@shiyu1994 (Collaborator) commented

With a static schedule and a chunk size of num_gpu_, I think only one thread will be used. So can we simply remove this line?

@denmoroz (Contributor, Author) commented Feb 19, 2022

@shiyu1994 To be honest, I just copy-pasted the code from here, which uses omp parallel, so I assumed it was done intentionally. My expectation was that the cleanup for each GPU device would run in parallel for efficiency (if num_gpu_ > 1). Do you think we should actually remove this?

@shiyu1994 (Collaborator) commented

After experimenting, I find that schedule(static, num_gpu_) results in only a single thread: with a chunk size of num_gpu_ and num_gpu_ total iterations, the entire loop is handed to one thread, so this does not parallelize the for loop.
I think it is ok to keep the omp parallel in this PR, and we can fix it in another PR. Thank you!
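
For reference, a small standalone example (not part of the PR) that reproduces the scheduling behaviour described above: with schedule(static, num_gpu) the chunk size equals the trip count, so all iterations land in a single chunk handled by one thread. Compile with e.g. g++ -fopenmp.

#include <omp.h>
#include <cstdio>

int main() {
  const int num_gpu = 4;  // stands in for num_gpu_
  // Chunk size == number of iterations => one chunk, assigned to a single thread.
#pragma omp parallel for schedule(static, num_gpu)
  for (int device_id = 0; device_id < num_gpu; ++device_id) {
    std::printf("iteration %d ran on thread %d\n", device_id, omp_get_thread_num());
  }
  return 0;
}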

@denmoroz (Contributor, Author) commented Feb 19, 2022

Hi @denmoroz. #4953 has been merged; can you please update this branch to include those changes?

@jmoralez Sure! Sorry for the delayed response, I only just noticed that you had asked for an update.

UPD: Done

@shiyu1994 (Collaborator) left a review

LGTM. Thanks for your contribution!

@shiyu1994 shiyu1994 merged commit 0db573c into microsoft:master Feb 20, 2022
@jameslamb jameslamb mentioned this pull request Oct 7, 2022
@github-actions bot commented

This pull request has been automatically locked because there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues and include a reference to this one.

@github-actions bot locked as resolved and limited conversation to collaborators on Aug 23, 2023