
Ensemble Posterior #1636

Closed
wants to merge 64 commits

Conversation

jduerholt
Contributor

Motivation

As discussed in #1064, this is an attempt to add an EnsemblePosterior to botorch that could be used, for example, by NN ensembles.

I have problems implementing rsample properly. I think my current implementation, which is based on DeterministicPosterior, is not correct; I think we should instead sample directly from the individual predictions of the ensemble. But I do not know how to interpret sample_shape in this context.

As sampler, I registered StochasticSampler for the new posterior class, but I am not sure whether this is correct either. Furthermore, I have another question regarding StochasticSampler. Its docstring states that it should not be used in combination with optimize_acqf, yet StochasticSampler is assigned to DeterministicPosterior. Does that mean one cannot use a ModelList consisting of a DeterministicModel and GPs in combination with optimize_acqf?

@Balandat: any suggestions on this?

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Unit tests. Not yet implemented/finished as it is still WIP.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Jan 20, 2023
@codecov

codecov bot commented Jan 20, 2023

Codecov Report

Merging #1636 (3e83ea1) into main (72e872a) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##              main    #1636      +/-   ##
===========================================
- Coverage   100.00%   99.99%   -0.01%     
===========================================
  Files          155      169      +14     
  Lines        13788    14444     +656     
===========================================
+ Hits         13788    14443     +655     
- Misses           0        1       +1     
Impacted Files Coverage Δ
botorch/models/deterministic.py 100.00% <100.00%> (ø)
botorch/models/ensemble.py 100.00% <100.00%> (ø)
botorch/posteriors/deterministic.py 100.00% <100.00%> (ø)
botorch/posteriors/ensemble.py 100.00% <100.00%> (ø)
botorch/sampling/get_sampler.py 100.00% <100.00%> (ø)
botorch/sampling/index_sampler.py 100.00% <100.00%> (ø)
botorch/optim/parameter_constraints.py 99.27% <0.00%> (-0.73%) ⬇️
botorch/optim/fit.py 100.00% <0.00%> (ø)
botorch/generation/gen.py 100.00% <0.00%> (ø)
botorch/optim/optimize.py 100.00% <0.00%> (ø)
... and 20 more


@Balandat
Contributor

Hmm, so one high-level question I have here is why the EnsemblePosterior operates on a finite set of values rather than keeping a reference to the model and then evaluating it the way @wjmaddox's code did here: https://github.com/samuelstanton/lambo/blob/dec2f303d847efefc4392ebfd2d78226d543388c/lambo/models/deep_ensemble.py#L177

Is the idea to just predict the output for all models in the model ensemble and then randomly sample from that (rather than re-evaluating a subset of the ensemble models)? I guess that seems reasonable if the ensemble models are explicitly defined and we don't have any scalability issues. But if the ensemble models are lazily defined by some parameters and only actually built upon evaluation, this could cause issues.

Assuming we're in the case where we can easily do this, then I think there may be some confusion about sample_shape. Basically, this being an ensemble of (I take it deterministic) models, the number of different samples that can possibly be drawn here is finite (s in your case). So it seems to me that rsample(sample_shape) should essentially grab sample_shape.numel() elements from your values tensor and then arrange them appropriately, something like this:

indcs = torch.randint(values.shape[-1], sample_shape)
samples = values[..., indcs]  # batch_shape x q x m x sample_shape
# move the sample dimensions to the front
samples = samples.permute(
    *range(values.ndim - 1, samples.ndim),
    *range(values.ndim - 1),
)
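Expanded into a minimal runnable example (the tensor shapes here are illustrative toy values, not the botorch API):

```python
import torch

# Toy ensemble predictions: batch_shape x q x m x s
# (2 batches, q=3 candidates, m=1 output, s=5 ensemble members).
values = torch.randn(2, 3, 1, 5)
sample_shape = torch.Size([4])

# Draw sample_shape.numel() random ensemble indices.
indcs = torch.randint(values.shape[-1], sample_shape)
samples = values[..., indcs]  # batch_shape x q x m x sample_shape

# Move the sample dimensions to the front: sample_shape x batch_shape x q x m.
samples = samples.permute(
    *range(values.ndim - 1, samples.ndim),
    *range(values.ndim - 1),
)
print(samples.shape)  # torch.Size([4, 2, 3, 1])
```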

It is stated in the docstring of StochasticSampler that it should not be used in combination with optimize_acqf. But StochasticSampler is assigned to the DeterministicPosterior. Does it mean that one cannot use a ModelList consisting of a DeterministicModel and GPs in combination with optimize_acqf?

So DeterministicPosterior isn't really doing anything; it's mostly just a crutch to make sure that deterministic "models" can be used in the same way as other probabilistic models. So using StochasticSampler here is just a convenience. I guess the proper thing would be to have a DeterministicSampler that doesn't do anything either, but I guess we wanted to avoid an inflation of dummy classes here? cc @saitcakmak.

The point about not using a stochastic sampler with optimize_acqf is that, by default, we use optimization algorithms that assume a deterministic function and break in bad ways if that's not the case. But in this special case of a DeterministicPosterior there is actually no randomness, and the StochasticSampler used does indeed produce deterministic values, so all is good.

For this ensemble posterior I do think it makes sense to use StochasticSampler. You'll probably want to add the EnsemblePosterior to this check here though: https://github.com/pytorch/botorch/blob/main/botorch/sampling/stochastic_samplers.py#L80

Now, if you do want to use deterministic optimization algorithms, the idea would essentially be to fix the models that are used to generate the "samples". This would mean that the "base samples" are essentially the indices you'd use to index into the values; these would stay fixed on the sampler, but a new model.posterior(X) call would generate new values at a new X (so if the indices are fixed across evaluations and the model is differentiable, then the samples will be differentiable w.r.t. X).

Does this make sense?
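To illustrate the fixed-index idea (purely hypothetical names; the "ensemble" here is just two differentiable functions of X):

```python
import torch

# Hypothetical two-member "ensemble": each member is differentiable in X.
def ensemble_values(X):
    # Returns q x m x s values with s=2 ensemble members.
    return torch.stack([X.sin(), X.cos()], dim=-1)

# The "base samples" are ensemble indices, fixed once on the sampler ...
fixed_indices = torch.tensor([0, 1, 0])  # sample_shape = 3

X = torch.randn(4, 1, requires_grad=True)  # q=4, m=1
values = ensemble_values(X)                # fresh values at each new X
samples = values[..., fixed_indices]       # deterministic given the indices

# ... so the samples are differentiable w.r.t. X.
samples.sum().backward()
print(X.grad.shape)  # torch.Size([4, 1])
```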

@jduerholt
Contributor Author

Hi @Balandat,

thanks for your detailed answer. This is just my first attempt to integrate an EnsemblePosterior into botorch, so I am very happy about your comments and open to changes to the concept.

To your high level question:

I saw @wjmaddox's code, but I wrote mine with the intention that we compute the output of all models first and then sample from these values as you described, as, for example, a Random Forest does. I had not thought about what you described regarding lazily defined models and/or scalability issues.

So what is your recommendation here? Which solution would you prefer? What I like about my proposal is its similarity to the other models and posteriors in botorch (I also added an abstract EnsembleModel class): a model has a posterior method which calls the forward method in which inference is performed.

Thanks for your explanation of the StochasticSampler and the DeterministicModel. I was just wondering about the docstring there. Maybe we should just add a sentence there noting the DeterministicModel as an exception.

My intention for the EnsemblePosterior is to be able to use it with deterministic optimization algorithms. So which sampler should one register for it so that the base_samples are indices that stay fixed across consecutive calls? In case one wants to use the EnsemblePosterior with a stochastic optimizer, one could then register another sampler.

If we followed my initial idea, the abstract EnsembleModel class would be very similar to the DeterministicModel class; in principle, the only difference is the posterior class. So one should think about introducing a new base class, from which both DeterministicModel and EnsembleModel inherit, that implements an abstract method _posterior returning the respective posterior object.
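A sketch of what that hierarchy could look like (hypothetical class names, not botorch code; the "posteriors" are stand-in tuples):

```python
from abc import ABC, abstractmethod

class BaseModelSketch(ABC):
    """Shared base: posterior() calls forward() and wraps the result."""

    @abstractmethod
    def forward(self, X): ...

    @abstractmethod
    def _posterior(self, predictions):
        """Wrap raw predictions in the posterior class for this model type."""

    def posterior(self, X):
        return self._posterior(self.forward(X))

class DeterministicSketch(BaseModelSketch):
    def forward(self, X):
        return 2 * X

    def _posterior(self, predictions):
        return ("DeterministicPosterior", predictions)

class EnsembleSketch(BaseModelSketch):
    def forward(self, X):
        return [2 * X, 3 * X]  # one prediction per ensemble member

    def _posterior(self, predictions):
        return ("EnsemblePosterior", predictions)

print(DeterministicSketch().posterior(1.0))  # ('DeterministicPosterior', 2.0)
```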

Best,

Johannes

@Balandat
Contributor

So what is your recommendation here? Which solution would you prefer? What I like about my proposal is the similarity to the other models and posteriors in botorch (I also added an abstract EnsembleModel class.). A model has a posterior method which calls the forward method in which inference is performed.

Yes we definitely want to retain this API pattern. But one could do this in the two different ways I mentioned above.

Your version makes sense if there is a full ensemble of models of reasonably small ensemble size, so that computing the values across all models is not a big deal and we can always do it. Then we could have the sampler just pick out a random set of ensemble items. Of course, if they're all computed already and most of the work has been done, one might ask why we would sample at all rather than just use all of the computed outputs.

The other idea would be to have the ensemble model be a model whose posterior call just produces an object that keeps a reference back to the model, similar to Wesley's implementation. Then sampling would mean selecting a subset of models from the ensemble to compute the output when sampling from the posterior. That is, models would be evaluated in a lazy fashion as requested by the sampler. This would have the benefit of not wasting computation on evaluating all models in the ensemble when their outputs are never used.
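In sketch form, the lazy variant might look like this (hypothetical class and method names; the ensemble members here are plain callables):

```python
import torch

class LazyEnsemblePosterior:
    """Keeps a reference to the ensemble; members are evaluated on demand."""

    def __init__(self, models, X):
        self.models = models  # list of callables, one per ensemble member
        self.X = X            # inputs are stored, not evaluated eagerly

    def rsample_from_indices(self, indices):
        # Only the requested subset of ensemble members is evaluated.
        return torch.stack([self.models[i](self.X) for i in indices.tolist()])

models = [lambda X: X * 2.0, lambda X: X + 1.0, lambda X: -X]
posterior = LazyEnsemblePosterior(models, torch.ones(3, 1))
samples = posterior.rsample_from_indices(torch.tensor([0, 2]))
print(samples.shape)  # torch.Size([2, 3, 1])
```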

Thanks for your explanation for the StochasticSampler and the DeterministicModel. I was just wondering about the doctstring there. Maybe we should add there just a sentence regarding the DeterministicModel as exception.

I wonder if we should just be very clear here by implementing a dummy sampler that just evaluates DeterministicModels rather than overloading the meaning of StochasticSampler. Curious about @saitcakmak's thoughts on this.

My intention for the EnsemblePosterior is to be able to use it with deterministic optimization algorithms. So which sampler should one register for it to have the base_samples as indices that are fixed for consecutive calls? In case one wants to use the EnsemblePosterior with stochastic optimizer, one could then register another sampler.

For deterministic optimization we'd probably want to implement something like an EnsembleSampler or IndexSampler. This would store the base samples as indices into the list of available samples and then index into the posterior whenever called on a Posterior object.
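A minimal sketch of such an IndexSampler (illustrative only; the sampler that eventually landed in botorch may differ):

```python
import torch

class IndexSampler:
    """Draws ensemble indices once and reuses them across posterior calls,
    so the resulting samples are deterministic given the sampler state."""

    def __init__(self, sample_shape):
        self.sample_shape = sample_shape
        self.base_samples = None  # index "base samples", lazily constructed

    def __call__(self, values):
        # values: batch_shape x q x m x s, with s ensemble members.
        if self.base_samples is None:
            self.base_samples = torch.randint(values.shape[-1], self.sample_shape)
        samples = values[..., self.base_samples]
        # Move sample dims to the front: sample_shape x batch_shape x q x m.
        return samples.permute(
            *range(values.ndim - 1, samples.ndim), *range(values.ndim - 1)
        )

sampler = IndexSampler(torch.Size([8]))
first = sampler(torch.randn(2, 3, 1, 5))
second = sampler(torch.randn(2, 3, 1, 5))  # new values, same fixed indices
print(first.shape)  # torch.Size([8, 2, 3, 1])
```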

@jduerholt
Contributor Author

So one TODO would definitely be to implement this IndexSampler.

Concerning the two possible design patterns you outlined above, I am open to both, but the more I think about Wesley's implementation, the more I see its benefits. So I will change the PR accordingly if we decide to follow this design pattern.

Fine for you?

@Balandat
Contributor

I don't think it necessarily has to be either/or. One could have two different types of posterior: one that just contains a set of samples, and one that keeps the reference to the model and performs lazy evaluation. So we could start with whichever is easier / meets the more immediate need.

@jduerholt
Contributor Author

Ok, I will then just proceed with what I have currently and focus on implementing the IndexSampler and rsample correctly. Both are also needed for the other version. I will also change the default shape of values from batch_shape x q x m x sample_shape to sample_shape x batch_shape x q x m, so that it is consistent with the rest of the code, as for example here:

samples: A `sample_shape x batch_shape x q x m`-dim Tensors of

It could be that there will be more questions over the course of the week ;-)

@jduerholt
Contributor Author

Here comes already the first question:

In the docstring of the abstract rsample method, sample_shape is defined as a torch.Size object which specifies the number of samples and the number of batches to sample. Here it would have to mean something different, right?

The IndexSampler would not pass a torch.Size object to rsample but rather a torch.Tensor of dtype=int64 holding the indices to sample for each batch. Is this new meaning of sample_shape in rsample OK?

Or should we implement rsample as you described above, in the usual way, and implement a method rsample_by_indices on the EnsemblePosterior, which is then called by the IndexSampler instead of the conventional rsample?

Best,

Johannes

@Balandat
Contributor

The IndexSampler would not pass a torch.Size object to rsample but a torch.Tensor of dtype=int64, holding the indices to sample for each batch. Is this new meaning of sample_shape in rsample ok?

We can consider the index samples produced by the sampler as "base samples". Check out how the Normal samplers do this: their base samples are i.i.d. Normal, and they call an rsample_from_base_samples method defined on the posterior here: https://github.com/pytorch/botorch/blob/main/botorch/sampling/normal.py#L43-L49. You'd implement a similar method on the EnsemblePosterior. In EnsemblePosterior.rsample you can then just generate those samples (if not using a Sampler) and call rsample_from_base_samples, similar to what GPyTorchPosterior does here: https://github.com/pytorch/botorch/blob/main/botorch/posteriors/gpytorch.py#L180-L182
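In sketch form, that split could look like this (a simplified, hypothetical stand-in, keeping values as batch_shape x q x m x s as in the PR):

```python
import torch

class EnsemblePosteriorSketch:
    """Simplified stand-in for EnsemblePosterior; values: batch x q x m x s."""

    def __init__(self, values):
        self.values = values

    def rsample_from_base_samples(self, sample_shape, base_samples):
        # base_samples are ensemble indices (typically drawn by a sampler).
        samples = self.values[..., base_samples.reshape(sample_shape)]
        # Move sample dims to the front: sample_shape x batch x q x m.
        return samples.permute(
            *range(self.values.ndim - 1, samples.ndim),
            *range(self.values.ndim - 1),
        )

    def rsample(self, sample_shape=torch.Size()):
        # Without a sampler, just draw fresh index base samples.
        base_samples = torch.randint(self.values.shape[-1], sample_shape)
        return self.rsample_from_base_samples(sample_shape, base_samples)

post = EnsemblePosteriorSketch(torch.randn(2, 3, 1, 5))
print(post.rsample(torch.Size([4])).shape)  # torch.Size([4, 2, 3, 1])
```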

@saitcakmak
Contributor

Sorry about the confusion around StochasticSampler and DeterministicModel. I was just avoiding creating a dummy class identical to StochasticSampler. I'll add one to avoid any confusion going forward. Otherwise, @Balandat's explanation above is on point.

Regarding an IndexSampler, I agree that it'd be best to just use the indices as the base_samples and keep the API consistent by defining an rsample_from_base_samples that uses these index base_samples. rsample can then simply sample sample_shape indices and pass them to rsample_from_base_samples as the base_samples.

The motivation behind rsample and rsample_from_base_samples is to keep the rsample API consistent with the rsample of torch.Distribution, which produces differentiable but not necessarily deterministic samples; and use rsample_from_base_samples to ensure that the samples are deterministic and can be optimized with a gradient based deterministic optimizer, such as the L-BFGS-B used in optimize_acqf.

@jduerholt
Contributor Author

jduerholt commented Jan 28, 2023

Hi @saitcakmak and @Balandat,

I just added rsample_from_base_samples (including an update of rsample) and the IndexSampler. Perhaps you could have a look and tell me whether this looks the way you described above.

Note that I kept the shape of values as batch_shape x q x m x posterior_size. For this reason, the sampling is conducted over the last dimension, followed by a reshape.

Best,

Johannes

P.S.: I know it is buggy; I just want to make sure that it is going in the correct direction ;)

@jduerholt jduerholt marked this pull request as draft January 28, 2023 21:53
Contributor

@Balandat Balandat left a comment


Yup, this lgtm! One thing we could consider is to have the ensemble dimension between the batch shape and the q dimension; that feels a bit more natural to me, as trailing dimensions are usually outcome dimensions throughout the code base. But really, I'm not sure it makes a difference.

Comment on lines 78 to 85
# Note: we use a Tensor instance check so that `observation_noise = True`
# just gets ignored. This avoids having to do a bunch of case distinctions
# when using a ModelList.
if isinstance(kwargs.get("observation_noise"), Tensor):
    # TODO: Consider returning an MVN here instead
    raise UnsupportedError(
        "Deterministic models do not support observation noise."
    )
Contributor


One deeper question here is whether EnsembleModel should be limited to deterministic models or whether (possibly down the road) we want it to be more general and also support other models (e.g. GPs). We currently have a SaasFullyBayesianSingleTaskGP that is essentially a batched model whose posterior call returns a FullyBayesianPosterior.

It seems that a natural hierarchy would be to have generic Ensemble* versions from which we derive both the deterministic and GP ensemble models.

Not suggesting this is something we should do as part of this PR, but something to think about. cc @saitcakmak, @dme65, @sdaulton

Contributor Author


I do not feel competent enough to answer this question. I commented below on the similarity between the current version of EnsembleModel and DeterministicModel.

Contributor


Technically, everything is an ensemble of a single model, unless it is an ensemble of multiple models. I don't see any immediate benefit from such a refactor, so I'd say let's revisit this if we end up with more ensemble models in the future.

jduerholt and others added 3 commits January 30, 2023 15:16
Co-authored-by: Max Balandat <Balandat@users.noreply.github.com>
Co-authored-by: Max Balandat <Balandat@users.noreply.github.com>
Co-authored-by: Max Balandat <Balandat@users.noreply.github.com>
@jduerholt
Contributor Author

A few more minor points, o/w this looks great. I'd like to get some more input from other folks who have been working on the transforms more recently than I have, to make sure these changes don't cause any unforeseen issues.

I changed it according to your points. The one thing I could not change accordingly is explained above. I am looking forward to the comments of the other folks and hope to get it merged soon ;)

Contributor

@saitcakmak saitcakmak left a comment


Thanks for implementing this. A bunch of nits in line, lgtm otherwise.


jduerholt and others added 12 commits February 8, 2023 08:55
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>
@jduerholt
Contributor Author

@saitcakmak: thanks for your review. I changed the PR according to your suggestions. I only left you a question regarding the DeterministicSampler.

@facebook-github-bot
Contributor

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@saitcakmak
Contributor

Hmm, codecov shows missing coverage for a line that's just a comment (https://app.codecov.io/gh/pytorch/botorch/pull/1636#597cdf37e0736ca95369f0ff0f74eff4-R244). I'll re-run the tests to see if it disappears.

@saitcakmak
Contributor

I think something is wrong with codecov on that one. It's a file not touched by the PR, the line is a comment, and this diff doesn't remove any tests. I wouldn't worry about it.

@facebook-github-bot
Contributor

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@Balandat merged this pull request in a8efd76.

esantorella added a commit to esantorella/botorch that referenced this pull request Jul 2, 2024
…ticSampler (pytorch#2391)

Summary:
## Motivation

Delete references to deprecated DeterministicPosterior and DeterministicSampler, replacing `DeterministicPosterior` with `EnsemblePosterior` where appropriate. These were deprecated before 0.9.0, so now that we are past 0.11.0, they can be reaped.

Pull Request resolved: pytorch#2391

Test Plan:
Replaced `DeterministicPosterior` with `EnsemblePosterior` in tests

## Related PRs

pytorch#1636

Differential Revision: D59057165

Reviewed By: Balandat

Pulled By: Balandat
facebook-github-bot pushed a commit that referenced this pull request Jul 2, 2024
…ticSampler (#2391)

Summary:
## Motivation

Delete references to deprecated DeterministicPosterior and DeterministicSampler, replacing `DeterministicPosterior` with `EnsemblePosterior` where appropriate. These were deprecated before 0.9.0, so now that we are past 0.11.0, they can be reaped.

Pull Request resolved: #2391

Test Plan:
Replaced `DeterministicPosterior` with `EnsemblePosterior` in tests

## Related PRs

#1636

Reviewed By: Balandat

Differential Revision: D59057165

Pulled By: esantorella

fbshipit-source-id: 70a8405d0e75d414a685192808e8c0b18a6aca92
Labels
CLA Signed Do not delete this pull request or issue due to inactivity. Merged
4 participants