Performant Scaling of BlockDiagLinearOperator by DiagLinearOperator #14
Conversation
Force-pushed from 37635f9 to 9e10bc8
@@ -558,12 +558,7 @@ def _mul_matrix(self, other: Union[torch.Tensor, "LinearOperator"]) -> LinearOperator:
if isinstance(self, DenseLinearOperator) or isinstance(other, DenseLinearOperator):
This line was the source of the performance regressions. The reasoning behind it appears to be that MulLinearOperator always computes a root decomposition, which is both inefficient and introduces dead code in its implementation (see below). I am sidestepping this by replacing the * with secondary @ operators in the new special cases of the DiagLinearOperator and BlockDiagLinearOperator matmul methods, which produce MatmulLinearOperators instead.
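To illustrate why routing through @ is cheap here: the product of a block-diagonal matrix with a diagonal matrix only rescales the columns of each block, so it can be computed directly on the batched blocks without forming anything dense or decomposing anything. A pure-torch sketch of that arithmetic (illustrative only, not the code in this diff):

```python
import torch

# B @ D, where B is block-diagonal and D is diagonal, rescales the columns of B,
# so the scaling can act directly on B's batched blocks.
num_blocks, block_size = 4, 3
blocks = torch.randn(num_blocks, block_size, block_size)  # blocks of the BlockDiagLinearOperator
diag = torch.rand(num_blocks * block_size) + 0.5          # diagonal of the DiagLinearOperator

# Align the diagonal with the blocks and broadcast over each block's rows.
scaled_blocks = blocks * diag.view(num_blocks, 1, block_size)

# Sanity check against the dense computation.
dense = torch.block_diag(*blocks) @ torch.diag(diag)
assert torch.allclose(torch.block_diag(*scaled_blocks), dense)
```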
However, this does not get rid of the more general issue. To fix that, I propose two steps in a future PR:
- Introduce logic in the constructor of MulLinearOperator that decides whether or not to build a root decomposition.
- Even if a root decomposition seems beneficial, delay its computation until the very last moment, when it is needed in matmul, and cache the result (see the sketch after this list). This will give us ~0 overhead in the case where the linear operator represents a posterior covariance matrix that is constructed via a posterior call but only the posterior mean is needed, as was the case in the notebook that exhibited the regression.
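A minimal sketch of the delay-and-cache idea from the second step, using a hypothetical stand-in class rather than MulLinearOperator itself:

```python
from functools import cached_property

import torch


class LazyRootExample:
    """Hypothetical operator that defers its root decomposition until matmul needs it."""

    def __init__(self, matrix: torch.Tensor):
        self.matrix = matrix  # assumed symmetric positive definite

    @cached_property
    def root(self) -> torch.Tensor:
        # Expensive step: runs only on first access, then is cached on the instance.
        return torch.linalg.cholesky(self.matrix)

    def matmul(self, rhs: torch.Tensor) -> torch.Tensor:
        # The root is built here, at the last possible moment, and reused afterwards.
        return self.root @ (self.root.mT @ rhs)


op = LazyRootExample(2.0 * torch.eye(3))
print(op.matmul(torch.ones(3, 1)))  # first call triggers the Cholesky; later calls reuse it
```

If only the posterior mean is needed, matmul is never called and the decomposition never runs, which is the ~0-overhead case described above.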
I think this change should be fine, since the MulLinearOperator constructor performs root decompositions on left_linear_op and right_linear_op:
if not isinstance(left_linear_op, RootLinearOperator):
> Even if a root decomposition seems beneficial, delay its computation until the very last moment, when it is needed in matmul, and cache the result.

Agreed.
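For context, the eager pattern being referenced looks roughly like this (a paraphrase, not the actual constructor code; it assumes RootLinearOperator is importable from linear_operator.operators and that root_decomposition() is the expensive call the lazy proposal would defer):

```python
from linear_operator.operators import RootLinearOperator


def ensure_root(linear_op):
    # Hypothetical helper mirroring the quoted check: if a factor is not already a
    # RootLinearOperator, its root decomposition is computed eagerly, at construction time.
    if not isinstance(linear_op, RootLinearOperator):
        linear_op = linear_op.root_decomposition()
    return linear_op
```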
Force-pushed from 9e10bc8 to a93fe94
Overall this lgtm but I'll let @gpleiss double check the broader implications of this change.
…(Take 2) (#1414)
Summary: Pull Request resolved: #1414. Due to [this linear operator PR](cornellius-gp/linear_operator#14), we should now be able to remove the custom logic in `Standardize` without performance impact.
Reviewed By: saitcakmak
Differential Revision: D39746709
fbshipit-source-id: 286b092f073861cb52d409ef85ff3dc9047bae4a
The primary goal of this PR is to enable the efficient scaling of BlockDiagLinearOperators by DiagLinearOperators. This will allow us to remove a special case in BoTorch's outcome transform. To achieve this, this PR modifies and adds special cases to DiagLinearOperator's and BlockDiagLinearOperator's matmul methods.

I tested the notebook that exhibited the regression by patching in the function definitions here and ended up with a 16-second runtime, as opposed to 30+ minutes before.
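A hedged usage sketch of the scaling this PR enables (it assumes the DiagLinearOperator, BlockDiagLinearOperator, and DenseLinearOperator classes exported by linear_operator.operators and the to_dense() method; shapes are illustrative and the snippet is not taken from the PR's tests):

```python
import torch
from linear_operator.operators import (
    BlockDiagLinearOperator,
    DenseLinearOperator,
    DiagLinearOperator,
)

num_blocks, block_size = 4, 3
blocks = torch.randn(num_blocks, block_size, block_size)
diag_values = torch.rand(num_blocks * block_size) + 0.5

block_diag = BlockDiagLinearOperator(DenseLinearOperator(blocks))  # 12 x 12 block-diagonal
scale = DiagLinearOperator(diag_values)                            # 12 x 12 diagonal

# With the special cases in this PR, this stays a structured matmul-style operator
# instead of a MulLinearOperator that eagerly builds a root decomposition.
scaled = block_diag @ scale

# Verify against the dense computation.
expected = torch.block_diag(*blocks) @ torch.diag(diag_values)
assert torch.allclose(scaled.to_dense(), expected)
```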