
Fix memory leak in inducing point allocators #1890

Closed
esantorella wants to merge 1 commit into main

Conversation

esantorella (Member) commented Jun 16, 2023:

Fixes #1788.

Motivation

`allocate_inducing_points` leaks memory when passed a `kernel_matrix` with `requires_grad=True`. The memory leak happens due to a specific pattern of in-place torch operations in `_pivoted_cholesky_init`; see this comment for more explanation. There is no need for `allocate_inducing_points` to support a `kernel_matrix` with `requires_grad=True`, because the output of `allocate_inducing_points` is not differentiable anyway (thanks to in-place operations).

- [x] Make `_pivoted_cholesky_init` raise an `UnsupportedError` when passed a `kernel_matrix` with `requires_grad=True`. That is mildly BC-breaking, but I think that is okay since the alternative is a memory leak.
- [x] Evaluate kernels with `torch.no_grad()` where they are only used to be passed to `_pivoted_cholesky_init` (see the sketch below).
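
A rough sketch of the two changes (illustrative only, not the exact diff; the parameter list of `_pivoted_cholesky_init` is reproduced approximately):

```python
import torch
from botorch.exceptions.errors import UnsupportedError
from gpytorch.kernels import MaternKernel


def _pivoted_cholesky_init(
    train_inputs, kernel_matrix, max_length, quality_scores, epsilon=1e-6
):
    # New guard: the in-place updates further down interact badly with autograd,
    # so a kernel matrix that is attached to the graph is rejected up front.
    if kernel_matrix.requires_grad:
        raise UnsupportedError(
            "_pivoted_cholesky_init does not support a kernel_matrix that "
            "requires gradients."
        )
    ...  # existing pivoted-Cholesky selection logic, unchanged


# At the call sites, the kernel is now evaluated under torch.no_grad(),
# since its only consumer is _pivoted_cholesky_init:
train_X = torch.rand(20, 2)
covar_module = MaternKernel()
with torch.no_grad():
    kernel_matrix = covar_module(train_X).evaluate_kernel()
# kernel_matrix does not require grad, so the guard above is never triggered.
```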

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

- [x] Unit test for memory leak
- [x] Unit test for `UnsupportedError`

@esantorella self-assigned this Jun 16, 2023
@facebook-github-bot added the CLA Signed label Jun 16, 2023

facebook-github-bot (Contributor):

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Balandat (Contributor) left a comment:

thanks for the memory leak fix!

@@ -192,7 +192,7 @@ def __init__(
         Args:
             train_X: Training inputs (due to the ability of the SVGP to sub-sample
                 this does not have to be all of the training inputs).
-            train_Y: Training targets (optional).
+            train_Y: Not used.
Contributor:

interesting - why did this have the arg in the first place? Any reason (other than linting) not to get rid of this?

esantorella (Member, Author):

Perhaps it had this so that _SingleTaskVariationalGP would have a similar signature to SingleTaskVariationalGP? In any case, I don't see a good reason to keep it, especially since they differ in other ways at this point.

Contributor:

I think we should keep it to support inducing point allocators that take into account the target values. To this end, we'd need to add outputs (in addition to inputs) to the signature of inducing_point_allocator.allocate_inducing_points. This would be ignored for the default variance reduction strategy, but would be necessary for e.g. @henrymoss's "Inducing Point Allocation for Sparse Gaussian Processes in High-Throughput Bayesian Optimisation". This would also require supporting outputs in the quality functions, but not a lot else as far as I can tell.
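
For illustration, the extended signature might look roughly like this; the sketch below is only a possible shape for the suggestion, and the existing parameter names are reproduced approximately:

```python
from typing import Optional

import torch
from torch import Tensor
from gpytorch.kernels import Kernel


class InducingPointAllocator:
    def allocate_inducing_points(
        self,
        inputs: Tensor,
        covar_module: Kernel,
        num_inducing: int,
        input_batch_shape: torch.Size,
        outputs: Optional[Tensor] = None,  # hypothetical new argument
    ) -> Tensor:
        # `outputs` would be ignored by the default variance-reduction
        # strategy but passed through to output-aware quality functions.
        ...
```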

saitcakmak (Contributor) left a comment:

Thanks! This was a tricky one to debug. More evidence that in-place operations and autograd do not mix well.

train_train_kernel = (
    MaternKernel().to(self.device)(inputs).evaluate_kernel()
)
quality_scores = torch.ones([10, 1], device=self.device)
with self.assertRaises(ValueError):
Contributor:

It'd be good to explicitly check for the specific error using assertRaisesRegex.
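
For example, something along these lines, building on the quoted snippet; the regex and the argument names below are placeholders rather than the actual values in the test:

```python
# Placeholder regex; substitute the message actually raised here.
with self.assertRaisesRegex(ValueError, "quality scores"):
    _pivoted_cholesky_init(
        train_inputs=inputs,
        kernel_matrix=train_train_kernel,
        max_length=5,
        quality_scores=quality_scores,
    )
```
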

esantorella (Member, Author):

Definitely agree. I ignored it since this test was pre-existing (I just rearranged it). Changing a bunch of these to assertRaisesRegex could be a good starter task.


codecov bot commented Jun 16, 2023

Codecov Report

Merging #1890 (8f39b6e) into main (00fa4a8) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 8f39b6e differs from pull request most recent head 0ac6b14. Consider uploading reports for the commit 0ac6b14 to get more accurate results

@@            Coverage Diff            @@
##              main     #1890   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          173       173           
  Lines        15137     15147   +10     
=========================================
+ Hits         15137     15147   +10     
Impacted Files                                        Coverage Δ
botorch/models/approximate_gp.py                      100.00% <ø> (ø)
botorch/models/utils/inducing_point_allocators.py     100.00% <100.00%> (ø)


facebook-github-bot (Contributor):

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary:
Fixes pytorch#1788 .
## Motivation

`allocate_inducing_points` leaks memory when passed a `kernel_matrix` with `requires_grad=True`. The memory leak happens due to a specific pattern of in-place torch operations in `_pivoted_cholesky_init`;  see [this comment](pytorch#1788 (comment)) for more explanation. There is no need for `allocate_inducing_points` to support a `kernel_matrix` with `requires_grad=True`, because the output of `allocate_inducing_points` is not differentiable anyway (thanks to in-place operations).

- [x] make `_pivoted_cholesky_init` raise an `UnsupportedError` when passed a `kernel_matrix` with `requires_grad=True`. That is mildly BC-breaking, but I think that is okay since the alternative is a memory leak.
- [x] Evaluate kernels with `torch.no_grad()` where they are only used to be passed to `_pivoted_cholesky_init`

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: pytorch#1890

Test Plan:
- [x] Unit test for memory leak
- [x] Unit test for UnsupportedError

Reviewed By: saitcakmak, Balandat

Differential Revision: D46803080

Pulled By: esantorella

fbshipit-source-id: 1fb9c6500d4246a3740a9fce4bda290043f8ac3b
facebook-github-bot (Contributor):

This pull request was exported from Phabricator. Differential Revision: D46803080

Balandat (Contributor):

cc @henrymoss

facebook-github-bot (Contributor):

@esantorella merged this pull request in 80ec6f4.

Labels: CLA Signed, Merged

Linked issue: [Bug] Memory not freed after a variational GP model is discarded (#1788)