
Polytope sampling for linear constraints along the q-dimension #1757

Closed

Conversation

jduerholt
Contributor

Motivation

This PR adds polytope sampling for linear constraints along the q-dimension, so that acquisition function optimization with such constraints no longer requires explicitly providing batch initial conditions. The provided solution is discussed in issue #1737.
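
For readers less familiar with the setup, here is a rough illustration (not part of the original PR text) of the two kinds of constraints in BoTorch's (indices, coefficients, rhs) tuple format; the exact semantics are my reading of the linked issue and docs, so treat this as a sketch rather than the merged API.

import torch

# Intra-point ("horizontal" / pointwise) constraint: applies to every point of the
# q-batch separately. 1-d indices refer to feature dimensions of a single point,
# here x[0] + x[1] + x[2] + x[3] == 1 for each candidate.
intra_point_equality = (
    torch.tensor([0, 1, 2, 3], dtype=torch.int64),            # feature indices
    torch.tensor([1.0, 1.0, 1.0, 1.0], dtype=torch.float64),  # coefficients
    1.0,                                                       # right-hand side
)

# Inter-point ("vertical" / cross-batch) constraint: couples different points of
# the q-batch. 2-d indices are (q_index, feature_index) pairs, here encoding
# x[0, 0] + x[1, 0] <= 1, written as -x[0, 0] - x[1, 0] >= -1.
inter_point_inequality = (
    torch.tensor([[0, 0], [1, 0]], dtype=torch.int64),
    torch.tensor([-1.0, -1.0], dtype=torch.float64),
    -1.0,
)

Both kinds are passed through the inequality_constraints / equality_constraints arguments of optimize_acqf; before this PR, the inter-point case required supplying batch_initial_conditions manually.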

The PR is about 95% complete, but I am not sure about some of the naming: in the method names I often called the two types of constraints horizontal and vertical, and I feel there is better naming. It would be nice if @Balandat or @saitcakmak could have a look and give some hints so that I can finish it. From a functionality perspective, it should be complete.

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Unit tests.

codecov bot commented Mar 21, 2023

Codecov Report

Merging #1757 (2446cd4) into main (ba90c9b) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 2446cd4 differs from pull request most recent head d35aae1. Consider uploading reports for the commit d35aae1 to get more accurate results

@@            Coverage Diff             @@
##              main     #1757    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          170       170            
  Lines        14621     14762   +141     
==========================================
+ Hits         14621     14762   +141     
Impacted Files Coverage Δ
botorch/acquisition/input_constructors.py 100.00% <ø> (ø)
botorch/acquisition/multi_objective/objective.py 100.00% <ø> (ø)
botorch/models/multitask.py 100.00% <ø> (ø)
botorch/models/utils/__init__.py 100.00% <ø> (ø)
botorch/acquisition/utils.py 100.00% <100.00%> (ø)
botorch/exceptions/warnings.py 100.00% <100.00%> (ø)
botorch/generation/gen.py 100.00% <100.00%> (ø)
botorch/models/deterministic.py 100.00% <100.00%> (ø)
botorch/models/pairwise_gp.py 100.00% <100.00%> (ø)
botorch/models/transforms/input.py 100.00% <100.00%> (ø)
... and 6 more


@Balandat (Contributor) left a comment

Thanks! I think the horizontal / vertical constraint naming is not really clear until you already understand the setup very well. How about naming those something like intra/inter point constraints? Or pointwise / cross-batch? Open to other naming suggestions as well.

Other than that this lgtm apart from docstrings.

inequality_constraints=inequality_constraints,
equality_constraints=equality_constraints,
n=n,
bounds=torch.hstack([bounds for _ in range(q)]),
Contributor

So the whole point of this change is to draw n samples from a q*d-dim space, rather than n*q samples from a d-dim space. I don't have a great sense of this, but one thing that may happen by doing so is that you may need to increase the burnin and thinning options here to get comparable mixing behavior for the chain. I haven't really looked into this much, but it would be worth doing a basic numerical study that compares the two approaches (without actually imposing the cross-batch constraints) just to see how much of an issue this is.

Contributor Author

Good point. I did a quick test and looked at the autocorrelation of the samples, and it does indeed have an influence:

With the old approach, for a scenario with a q-batch of 4 and 1024 raw samples, I get the following autocorrelation:

import numpy as np
import torch
from botorch.utils.sampling import get_polytope_samples
from botorch.optim.initializers import transform_constraints
import matplotlib.pyplot as plt


tkwargs = {'dtype': torch.float64, 'device': 'cpu'}

bounds = torch.tensor([[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]).to(**tkwargs)

# equality constraint: x0 + x1 + x2 + x3 == 1
equalities = [
    (torch.tensor([0, 1, 2, 3], dtype=torch.int64), torch.tensor([1, 1, 1, 1]).to(**tkwargs), 1.0)
]

# inequality constraint: x0 + x1 >= 0.3
inequalities = [
    (torch.tensor([0, 1], dtype=torch.int64), torch.tensor([1, 1]).to(**tkwargs), 0.3)
]

q = 4
raw_samples = 1024

# old approach: draw q * raw_samples points from the d-dimensional polytope
samples = get_polytope_samples(
    n=q * raw_samples,
    bounds=bounds,
    inequality_constraints=inequalities,
    equality_constraints=equalities,
    n_burnin=10000,
    thinning=32,
)

samples = samples.detach().numpy()

plt.acorr(samples[:, 0], maxlags=24, normed=True)
plt.show()

[image: autocorrelation plot of samples[:, 0] for the old, d-dimensional sampling approach]

Doing it the new way, sampling from a q*d-dimensional space, I get the following:

from botorch.optim.initializers import transform_constraints

# new approach: draw raw_samples points from the (q*d)-dimensional polytope,
# with the constraints mapped into that flattened space via transform_constraints
samples2 = get_polytope_samples(
    n=raw_samples,
    bounds=torch.hstack([bounds for _ in range(q)]),
    inequality_constraints=transform_constraints(inequalities, q=q, d=5),
    equality_constraints=transform_constraints(equalities, q=q, d=5),
    n_burnin=10000,
    thinning=32,
)

# unstack the q points of each draw so the result is comparable to the old approach
samples2 = samples2.reshape(torch.Size((q * raw_samples, 5)))

samples2 = samples2.detach().numpy()

plt.acorr(samples2[:, 0], maxlags=24, normed=True)
plt.show()

[image: autocorrelation plot of samples2[:, 0] for the new, q*d-dimensional sampling approach]

We see higher correlation at every fourth sample, which makes sense, as four samples here correspond to one polytope sample in the new setup. Setting thinning to 32*q cures the problem:

[image: autocorrelation plot with thinning set to 32*q; the extra correlation is gone]
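
For reference, the call that produced this plot is the same as the second snippet above, only with thinning scaled by q (the author described this change but did not post the snippet; this is my reconstruction):

# same q*d-space sampling as above, but with thinning scaled by q so that
# consecutive returned q-batches come from effectively independent chain states
samples3 = get_polytope_samples(
    n=raw_samples,
    bounds=torch.hstack([bounds for _ in range(q)]),
    inequality_constraints=transform_constraints(inequalities, q=q, d=5),
    equality_constraints=transform_constraints(equalities, q=q, d=5),
    n_burnin=10000,
    thinning=32 * q,
)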

So I changed the default to always 32*q.

What do you think? Is this sufficient?

Contributor

Thanks for the great analysis! (cc @jelena-markovic). Yes, I think this makes a lot of sense, thanks!

Contributor

Actually, one concern here is that you are transforming these constraints regardless of whether there are any inter-point / cross-batch constraints present. That means that, in that case, we're running the hit-and-run sampler in an unnecessarily high-dimensional space, and could instead just run it for n*q samples as we did before. Could you please modify the logic here to do this branching? It would also be good to move this whole branch starting at line 257 out into a helper function to keep things tidy.
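
A rough sketch of the branching being requested here (function and helper names are hypothetical, chosen for illustration, and not necessarily the code that was eventually merged; it reuses get_polytope_samples and the PR's transform_constraints as imported in the snippets above):

import torch
from botorch.utils.sampling import get_polytope_samples
from botorch.optim.initializers import transform_constraints


def has_inter_point_constraints(constraints) -> bool:
    # A constraint is inter-point / cross-batch if its index tensor is 2-d,
    # i.e. each row is a (q_index, feature_index) pair.
    return any(indices.dim() == 2 for indices, _, _ in (constraints or []))


def sample_q_batches_from_polytope(
    n, q, bounds, inequality_constraints, equality_constraints, n_burnin, thinning
):
    d = bounds.shape[-1]
    if has_inter_point_constraints(inequality_constraints) or has_inter_point_constraints(
        equality_constraints
    ):
        # Cross-batch constraints present: sample n points from the q*d-dimensional
        # polytope, with thinning scaled by q as discussed above.
        samples = get_polytope_samples(
            n=n,
            bounds=torch.hstack([bounds for _ in range(q)]),
            inequality_constraints=transform_constraints(inequality_constraints, q=q, d=d),
            equality_constraints=transform_constraints(equality_constraints, q=q, d=d),
            n_burnin=n_burnin,
            thinning=thinning * q,
        )
    else:
        # No cross-batch constraints: keep the cheaper d-dimensional sampler and
        # simply draw n*q points, as before this PR.
        samples = get_polytope_samples(
            n=n * q,
            bounds=bounds,
            inequality_constraints=inequality_constraints,
            equality_constraints=equality_constraints,
            n_burnin=n_burnin,
            thinning=thinning,
        )
    return samples.view(n, q, d)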

jduerholt and others added 4 commits March 22, 2023 19:32
@jduerholt jduerholt requested a review from Balandat March 23, 2023 21:22
@Balandat (Contributor) left a comment

Thanks for iterating on this. I do have one request regarding not unnecessarily increasing the dimensionality for the sampler if there aren't any inter-point / cross-batch constraints present.

jduerholt and others added 11 commits March 25, 2023 19:42
@jduerholt (Contributor Author)

> Thanks for iterating on this. I do have one request regarding not unnecessarily increasing the dimensionality for the sampler if there aren't any inter-point / cross-batch constraints present.

Done ;)

@jduerholt jduerholt requested a review from Balandat March 27, 2023 17:54
@Balandat (Contributor) left a comment

Most excellent, this looks great. A couple of tiny nits, but otherwise ready to merge in!

@facebook-github-bot (Contributor)

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

jduerholt and others added 2 commits March 28, 2023 08:12
jduerholt and others added 2 commits March 28, 2023 08:13
@jduerholt (Contributor Author)

@Balandat: I just incorporated your suggestions. Should be final now ;)

@Balandat (Contributor) left a comment

Almost - actually, this throws a test failure on the GPU due to device mismatch, see inline comment for the fix.
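
The inline comment with the actual fix is not visible in this thread; the usual pattern for this kind of CUDA test failure is to construct new tensors with the dtype and device of an existing input instead of the CPU defaults, roughly like this (variable names are placeholders, not the actual fix):

import torch

bounds = torch.rand(2, 5, dtype=torch.float64)  # stand-in for the real input tensor
# deriving dtype/device from an existing tensor keeps the code working when the
# inputs live on a GPU
zeros = torch.zeros(bounds.shape[-1], dtype=bounds.dtype, device=bounds.device)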

@facebook-github-bot (Contributor)

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

jduerholt and others added 3 commits March 28, 2023 16:02
@Balandat (Contributor)

Looks like one more ufmt pass is needed. Home stretch.

@jduerholt (Contributor Author)

> Looks like one more ufmt pass is needed. Home stretch.

Should be fine now ;) I want to have a button for this directly in GitHub ;)

@Balandat (Contributor)

> I want to have a button for this directly in GitHub ;)

GitHub may accept PRs as well...

@facebook-github-bot (Contributor)

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@Balandat (Contributor)

Found another CUDA issue, but fixed that internally. All good from your end. Thanks!

@facebook-github-bot (Contributor)

@Balandat merged this pull request in bb020d3.
