Polytope sampling for linear constraints along the q-dimension #1757
Conversation
Codecov Report
@@           Coverage Diff           @@
##              main    #1757    +/-   ##
==========================================
  Coverage   100.00%  100.00%
==========================================
  Files          170      170
  Lines        14621    14762    +141
==========================================
+ Hits         14621    14762    +141
Thanks! I think the horizontal / vertical constraint naming is not really clear until you already understand the setup very well. How about naming those something like intra/inter point constraints? Or pointwise / cross-batch? Open to other naming suggestions as well.
Other than that this lgtm apart from docstrings.
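For illustration only (the exact tuple convention used in this PR may differ): an intra-point / pointwise constraint applies to every candidate of the q-batch separately, while an inter-point / cross-batch constraint couples entries of different candidates, which one could hypothetically encode via 2-d index tensors of (q_index, feature_index) pairs:

import torch

# Pointwise constraint: x_0 + x_1 >= 0.3 must hold for each point of the q-batch.
intra_point = (
    torch.tensor([0, 1], dtype=torch.int64),         # feature indices
    torch.tensor([1.0, 1.0], dtype=torch.float64),   # coefficients
    0.3,                                             # right-hand side
)

# Cross-batch constraint (hypothetical encoding): X[0, 0] + X[1, 0] >= 0.3,
# i.e. feature 0 summed over the first two candidates of the q-batch.
inter_point = (
    torch.tensor([[0, 0], [1, 0]], dtype=torch.int64),  # (q_index, feature_index) pairs
    torch.tensor([1.0, 1.0], dtype=torch.float64),
    0.3,
)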
botorch/optim/initializers.py
Outdated
inequality_constraints=inequality_constraints,
equality_constraints=equality_constraints,
n=n,
bounds=torch.hstack([bounds for _ in range(q)]),
So the whole point of this change is to draw n samples from a q*d-dim space, rather than n*q samples from a d-dim space. I don't have a great sense of this, but one thing that may happen by doing so is that you may need to increase the burnin and thinning options here to get comparable mixing behavior for the chain. I haven't really looked into this much, but it would be worth trying to do a basic numerical study that compares this (without actually imposing the cross-batch constraints) just to see how much of an issue this is.
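For context, a rough sketch of what lifting a pointwise constraint into the flattened q*d space could look like (this is only an illustration, not necessarily how the PR's transform_constraints helper is implemented):

import torch

def tile_pointwise_constraint(indices, coefficients, rhs, q, d):
    # Repeat a d-dim pointwise constraint once per point of the q-batch,
    # offsetting the feature indices into the flattened q*d space.
    return [(indices + i * d, coefficients, rhs) for i in range(q)]

# Example: x_0 + x_1 >= 0.3, tiled across a q-batch of 4 in a 5-dim space.
constraints_qd = tile_pointwise_constraint(
    indices=torch.tensor([0, 1], dtype=torch.int64),
    coefficients=torch.tensor([1.0, 1.0], dtype=torch.float64),
    rhs=0.3,
    q=4,
    d=5,
)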
Good point. I did a quick test and looked at the autocorrelation of the samples, and indeed it has an influence:
With the old way, for a scenario with a q-batch of 4 and 1024 raw samples, I get the following autocorrelation:
import numpy as np
import torch
from botorch.utils.sampling import get_polytope_samples
from botorch.optim.initializers import transform_constraints
import matplotlib.pyplot as plt

tkwargs = {"dtype": torch.float64, "device": "cpu"}
bounds = torch.tensor([[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]).to(**tkwargs)
# x_0 + x_1 + x_2 + x_3 == 1
equalities = [
    (torch.tensor([0, 1, 2, 3], dtype=torch.int64), torch.tensor([1, 1, 1, 1]).to(**tkwargs), 1.0)
]
# x_0 + x_1 >= 0.3
inequalities = [
    (torch.tensor([0, 1], dtype=torch.int64), torch.tensor([1, 1]).to(**tkwargs), 0.3)
]
q = 4
raw_samples = 1024

# Old approach: draw q * raw_samples points directly in the d-dim space.
samples = get_polytope_samples(
    n=q * raw_samples,
    bounds=bounds,
    inequality_constraints=inequalities,
    equality_constraints=equalities,
    n_burnin=10000,
    thinning=32,
)
samples = samples.detach().numpy()
plt.acorr(samples[:, 0], maxlags=24, normed=True)
plt.show()
Doing it the new way via sampling from a q*d space, I get the following:
# New approach: draw raw_samples points from the flattened q*d space and
# reshape them back into individual d-dim points afterwards.
samples2 = get_polytope_samples(
    n=raw_samples,
    bounds=torch.hstack([bounds for _ in range(q)]),
    inequality_constraints=transform_constraints(inequalities, q=q, d=5),
    equality_constraints=transform_constraints(equalities, q=q, d=5),
    n_burnin=10000,
    thinning=32,
)
samples2 = samples2.reshape(torch.Size((q * raw_samples, 5)))
samples2 = samples2.detach().numpy()
plt.acorr(samples2[:, 0], maxlags=24, normed=True)
plt.show()
We see higher correlation for every fourth sample, which makes sense, as four samples here correspond to one polytope sample in the new setup. Setting thinning to 32*q cures the problem, so I changed the default to always 32*q.
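Roughly, the rerun with the scaled thinning looks like this (reusing the variables and imports from the snippets above):

# Same q*d-space setup as above, but with thinning scaled by q so that
# consecutive flattened samples come from well-separated hit-and-run steps.
samples3 = get_polytope_samples(
    n=raw_samples,
    bounds=torch.hstack([bounds for _ in range(q)]),
    inequality_constraints=transform_constraints(inequalities, q=q, d=5),
    equality_constraints=transform_constraints(equalities, q=q, d=5),
    n_burnin=10000,
    thinning=32 * q,
)
samples3 = samples3.reshape(torch.Size((q * raw_samples, 5))).detach().numpy()
plt.acorr(samples3[:, 0], maxlags=24, normed=True)
plt.show()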
What do you think? Is this sufficient?
Thanks for the great analysis! (cc @jelena-markovic). Yes, I think this makes a lot of sense, thanks!
Actually, one concern here is that you are transforming these constraints regardless of whether there are any inter-point / cross-batch constraints present. That means that in that case we're running the hit-and-run sampler in an unnecessarily high-dimensional space, and could instead just run it for n*q samples as we did before. Could you please modify this logic here to do this branching? Would be good to move this whole branch starting in l. 257 out into a helper function to keep things tidy.
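To make the request concrete, here is a sketch of the branching; the names has_inter_point_constraints and sample_q_batches_from_polytope are placeholders, and detecting cross-batch constraints via 2-d index tensors is an assumption:

import torch
from botorch.utils.sampling import get_polytope_samples
from botorch.optim.initializers import transform_constraints

def has_inter_point_constraints(constraints):
    # Assumption: cross-batch constraints carry 2-d index tensors of
    # (q_index, feature_index) pairs, while pointwise ones use 1-d indices.
    return any(indices.dim() == 2 for indices, _, _ in (constraints or []))

def sample_q_batches_from_polytope(
    n, q, bounds, inequality_constraints=None, equality_constraints=None,
    n_burnin=10000, thinning=32,
):
    # Placeholder helper: only lift the problem to q*d dimensions when a
    # cross-batch constraint actually requires it.
    if has_inter_point_constraints(inequality_constraints) or has_inter_point_constraints(
        equality_constraints
    ):
        d = bounds.shape[-1]
        samples = get_polytope_samples(
            n=n,
            bounds=torch.hstack([bounds for _ in range(q)]),
            inequality_constraints=(
                transform_constraints(inequality_constraints, q=q, d=d)
                if inequality_constraints
                else None
            ),
            equality_constraints=(
                transform_constraints(equality_constraints, q=q, d=d)
                if equality_constraints
                else None
            ),
            n_burnin=n_burnin,
            thinning=thinning * q,
        )
    else:
        samples = get_polytope_samples(
            n=n * q,
            bounds=bounds,
            inequality_constraints=inequality_constraints,
            equality_constraints=equality_constraints,
            n_burnin=n_burnin,
            thinning=thinning,
        )
    return samples.view(n, q, -1)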
Thanks for iterating on this. I do have one request regarding not unnecessarily increasing the dimensionality for the sampler if there aren't any inter-point / cross-batch constraints present.
Done ;)
Most excellent, this looks great. A couple of tiny nits, but o/w ready to merge in!
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@Balandat: I just incorporated your suggestions. Should be final now ;)
Almost - actually, this throws a test failure on the GPU due to device mismatch, see inline comment for the fix.
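The usual pattern for fixing such a mismatch is to construct new tensors with the dtype and device of an existing input rather than relying on the defaults; a minimal sketch (bounds used here purely as an illustrative reference tensor):

import torch

bounds = torch.rand(2, 5, dtype=torch.float64)  # may live on a CUDA device in the tests

# Derive dtype/device from the reference tensor so CPU and GPU runs both work.
coefficients = torch.ones(4, dtype=bounds.dtype, device=bounds.device)
# Alternatively, move an existing tensor onto the same dtype/device:
other = torch.zeros(4).to(bounds)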
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Looks like one more
Should be fine now ;) I want to have a button for this directly in GitHub ;)
GitHub may accept PRs as well...
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Found another CUDA issue, but fixed that internally. All good from your end. Thanks!
Motivation
This PR adds polytope sampling for linear constraints along the q-dimension, so that in this case acqf optimization no longer requires explicitly providing batch initial conditions. The provided solution is discussed in issue #1737.
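As a rough usage sketch (reusing the transform_constraints helper exercised in the review discussion above; the dimensions and the constraint are illustrative), drawing raw q-batches that respect a linear constraint via the flattened q*d space could look like:

import torch
from botorch.utils.sampling import get_polytope_samples
from botorch.optim.initializers import transform_constraints

d, q, n = 5, 4, 128
bounds = torch.zeros(2, d, dtype=torch.float64)
bounds[1] = 1.0

# Illustrative pointwise constraint: x_0 + x_1 >= 0.3 for every candidate.
inequalities = [
    (torch.tensor([0, 1], dtype=torch.int64), torch.ones(2, dtype=torch.float64), 0.3)
]

# Sample n q-batches in the flattened q*d space and reshape to (n, q, d).
raw_q_batches = get_polytope_samples(
    n=n,
    bounds=torch.hstack([bounds for _ in range(q)]),
    inequality_constraints=transform_constraints(inequalities, q=q, d=d),
    n_burnin=10000,
    thinning=32 * q,
).view(n, q, d)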
The PR is about 95% complete, but I am not sure about some of the naming, so in the method names I often called the two types of constraints horizontal and vertical. I feel that there is better naming, so it would be nice if @Balandat or @saitcakmak could have a look and give some hints so that I can finish it. From the functionality perspective, it should be complete.
Have you read the Contributing Guidelines on pull requests?
Yes.
Test Plan
Unit tests.