Bugfix for Proximal acquisition function wrapper for negative base acquisition functions #1447
Conversation
…to solve issue with qExpectedHypervolumeImprovement
Codecov Report
@@           Coverage Diff            @@
##              main    #1447   +/-   ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files          124      124
  Lines        11548    11554     +6
=========================================
+ Hits         11548    11554     +6
Thanks, lgtm! The tutorial failure is unrelated (#1446), so please ignore it.
@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This is a higher level design question, rather than a comment on the particular fix. @roussel-ryan, is there a reason against simply adding a quadratic to an arbitrary acquisition function? With this fix,
where
In addition, the latter would not change the magnitude of the acquisition function relative to the proximal term if the acquisition is negative. If it takes a large negative value, …
Thanks for fixing this. One thing that we should keep in mind here is that if the acquisition function values are small (as is often the case later during the optimization for things like Expected Improvement), this could substantially impact the behavior for functions that are indeed guaranteed to be positive. So in that case it seems that the choice of a large beta …
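For a rough sense of the scale involved (illustrative numbers only, not from the thread): with softplus(x, beta) = log(1 + exp(beta * x)) / beta, a small positive acquisition value gets pushed toward log(2) / beta, so a small beta inflates it substantially while a large beta keeps it close to the raw value.

import torch
import torch.nn.functional as F

# Illustrative only: softplus(x, beta) = log(1 + exp(beta * x)) / beta.
acq_value = torch.tensor(0.01)  # a small, positive acquisition value
print(F.softplus(acq_value, beta=1.0))    # ~0.70, far from the raw 0.01
print(F.softplus(acq_value, beta=100.0))  # ~0.013, much closer to the raw value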
@SebastianAment In response to large negative values for the base acquisition function, I'm hoping that using a Standardize outcome transform will prevent extreme values. It's not a perfect solution, but I think for most cases the proposed solution is sufficient. As for adding the biasing instead of multiplying it, you mentioned that the relative scale of each term would remain the same, but I think that prevents the biasing from working when the acquisition function is large (or vice versa). Unless I'm misunderstanding what you wrote?
@Balandat I could potentially remove the transform when all of the acquisition function values are found to be positive. Alternatively, I could set a large beta as the default, or raise an error when the acqf values are negative. We need to prevent it from ever being used with negative acqf values, because the issue can be quite easy to overlook.
That makes sense. The multiplicative version lets us choose weights proportional to the scale of the acquisition functions, which we might not know a priori but could update during a run. It's also equivalent to the additive version by taking the log of the acquisition function, and that might be more easily applied to negative acquisition values.
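A small sketch of the equivalence mentioned above, with made-up numbers: for a positive acquisition value, multiplying by the Gaussian proximal weight is the same as adding the negative quadratic to the log of the acquisition value.

import torch

acq_value = torch.tensor(2.5)   # hypothetical positive base acquisition value
sq_dist = torch.tensor(0.8)     # hypothetical weighted squared distance to the last point
prox_weight = torch.exp(-0.5 * sq_dist)

multiplicative = acq_value * prox_weight              # multiplicative proximal weighting
additive_log = torch.log(acq_value) - 0.5 * sq_dist   # additive version in log space

# The two agree (up to exponentiation) as long as acq_value is positive.
assert torch.allclose(multiplicative, torch.exp(additive_log))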
I am not sure how widely used this is. The main thing I'm worried about is that the default behavior for non-negative but small-in-magnitude acquisition functions may change quite a bit with this fix. How about we (i) make …
@Balandat made the change you suggested.
Thanks! A couple of nits, o/w this lgtm.
botorch/acquisition/proximal.py
Outdated
@@ -49,6 +53,7 @@ def __init__(
        acq_function: AcquisitionFunction,
        proximal_weights: Tensor,
        transformed_weighting: bool = True,
        beta: float = None,
-        beta: float = None,
+        beta: Optional[float] = None,
test/acquisition/test_proximal.py
Outdated
        ei_prox_beta = EI_prox_beta(test_X)
        self.assertTrue(torch.allclose(ei_prox_beta, ei * test_prox_weight))
        self.assertTrue(ei_prox_beta.shape == torch.Size([1]))
nit
-        self.assertTrue(ei_prox_beta.shape == torch.Size([1]))
+        self.assertEqual(ei_prox_beta.shape, torch.Size([1]))
test/acquisition/test_proximal.py
Outdated
        test_prox_weight = torch.exp(
            mv_normal.log_prob(proximal_test_X)
        ) / torch.exp(mv_normal.log_prob(last_X))
-        test_prox_weight = torch.exp(
-            mv_normal.log_prob(proximal_test_X)
-        ) / torch.exp(mv_normal.log_prob(last_X))
+        test_prox_weight = torch.exp(
+            mv_normal.log_prob(proximal_test_X) -
+            mv_normal.log_prob(last_X)
+        )
botorch/acquisition/proximal.py
Outdated
    acquisition function. Then the acquisition function is
    weighted via a squared exponential centered at the last training point,
-    acquisition function. Then the acquisition function is
-    weighted via a squared exponential centered at the last training point,
+    acquisition function. The acquisition function is
+    weighted via a squared exponential centered at the last training point,
botorch/acquisition/proximal.py
Outdated
    weighted via a squared exponential centered at the last training point,
    with varying lengthscales corresponding to `proximal_weights`. Can only be used
    with acquisition functions based on single batch models. Acquisition functions
    must be positive or `beta` is specified to apply a SoftPlus transform before
-    must be positive or `beta` is specified to apply a SoftPlus transform before
+    must be positive or `beta` must be specified to apply a SoftPlus transform before
test/acquisition/test_proximal.py
Outdated
        with self.assertRaises(RuntimeError):
            bad_neg_prox(test_X)
Can you use assertRaisesRegex in order to check for the proper error message here?
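For reference, a possible shape of that assertion; the message regex below is a placeholder, not necessarily the error text used in the PR.

# Placeholder regex; the actual error message in the PR may differ.
with self.assertRaisesRegex(RuntimeError, "negative acquisition function values"):
    bad_neg_prox(test_X)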
Updated w/suggestions. Thanks for the comments!
No problem, thanks for the contribution!
@saitcakmak I think this is ready for a re-import and merge!
@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Motivation
This PR fixes a major issue when using the ProximalAcquisitionFunction with base acquisition functions that are not strictly positive. It does so by applying a Softplus transformation to the base acquisition function values (using an optional beta = 1.0 argument) before multiplying by the proximal weighting.
Have you read the Contributing Guidelines on pull requests?
Yes
Test Plan
Tests have been updated with the correct (softplus-transformed) values.
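For context, a minimal usage sketch of the behavior described above, assuming the post-review signature shown in the diff (beta: Optional[float] = None); the data, hyperparameters, and default values here are illustrative and may differ from the merged code.

import torch
from botorch.acquisition import ExpectedImprovement
from botorch.acquisition.proximal import ProximalAcquisitionFunction
from botorch.models import SingleTaskGP

# Toy 1D data; the proximal weighting is centered on the model's last training point.
train_X = torch.rand(5, 1, dtype=torch.double)
train_Y = torch.rand(5, 1, dtype=torch.double)
model = SingleTaskGP(train_X, train_Y)

base_acqf = ExpectedImprovement(model, best_f=train_Y.max())

# With beta set, softplus(acq_value, beta) is applied before multiplying by the
# squared-exponential proximal weight, so negative base values no longer flip the sign.
prox_acqf = ProximalAcquisitionFunction(
    base_acqf,
    proximal_weights=torch.tensor([0.5], dtype=torch.double),
    beta=10.0,
)

test_X = torch.rand(1, 1, 1, dtype=torch.double)  # a single q=1 candidate
print(prox_acqf(test_X))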