Ensemble Posterior #1636
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1636 +/- ##
===========================================
- Coverage 100.00% 99.99% -0.01%
===========================================
Files 155 169 +14
Lines 13788 14444 +656
===========================================
+ Hits 13788 14443 +655
- Misses 0 1 +1
Hmm, so one high-level question I have here concerns how the posterior is constructed. Is the idea to just predict the output for all models in the model ensemble and then randomly sample from that (rather than re-evaluating a subset of the ensemble models)? I guess that seems reasonable if the ensemble models are explicitly defined and we don't have any scalability issues. But if the ensemble models are lazily defined by some parameters and only actually built upon evaluation, this could cause issues. Assuming we're in the case where we can easily do this, then I think there may be some confusion about `sample_shape`.
So, regarding `StochasticSampler`: the point about not using a stochastic sampler with `optimize_acqf` is that the default optimizers are deterministic quasi-Newton methods, which require the acquisition value to be deterministic for fixed inputs; re-drawing samples on every evaluation breaks that. For this ensemble posterior I do think it makes sense to use a stochastic sampler when optimizing with stochastic (gradient) optimizers. Now, if you do want to use deterministic optimization algorithms, the idea would essentially be to fix the models that are used to generate the "samples". This would mean that the "base samples" are essentially the indices that you'd use to index into the ensemble dimension of the values. Does this make sense?
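To make this concrete, here is a minimal sketch, with purely hypothetical shapes and names: fixing the ensemble indices once makes repeated evaluations deterministic, while re-drawing them on every call does not.

```python
import torch

torch.manual_seed(0)

# Hypothetical ensemble output: 5 ensemble members, q=2 points, 1 outcome.
ensemble_values = torch.randn(5, 2, 1)

# Stochastic behavior: fresh indices on every call, so the "samples" (and
# any acquisition value computed from them) change between evaluations.
# This is what breaks deterministic optimizers such as L-BFGS-B.
samples_1 = ensemble_values[torch.randint(5, (3,))]
samples_2 = ensemble_values[torch.randint(5, (3,))]

# Deterministic behavior: fix the index "base samples" once and reuse them,
# so the same inputs always produce the same samples.
fixed_idx = torch.randint(5, (3,))
samples_a = ensemble_values[fixed_idx]
samples_b = ensemble_values[fixed_idx]
assert torch.equal(samples_a, samples_b)
```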
Hi @Balandat, thanks for your detailed answer. This is just my first attempt to integrate an `EnsemblePosterior` into botorch. To your high-level question: I saw @wjmaddox's code, but I programmed it with the intention that we compute the output of all models first and then sample from these values, as you described, and as, for example, a random forest does. I had not thought about what you described regarding lazily defined models and/or scalability issues. So what is your recommendation here? Which solution would you prefer? What I like about my proposal is the similarity to the other models and posteriors in botorch (I also added an abstract `EnsembleModel` class). Thanks for your explanation regarding the samplers; my intention for `rsample` was exactly to sample from the precomputed ensemble values. If we follow my initial idea, the abstract `EnsembleModel` would compute the outputs of all ensemble members. Best, Johannes
Yes, we definitely want to retain this API pattern. But one could do this in the two different ways I mentioned above. Your version makes sense if there is a full ensemble of models of reasonably small ensemble size, so that computing the values across all models is not a big deal and we can always do that. Then we could have the sampler just pick out a random set of ensemble items. Of course, if they're all computed already and most of the work has been done, one may ask why we would actually sample in the first place rather than just use all computed outputs. The other idea would be to have the ensemble model be a model where the individual ensemble members are lazily defined and only evaluated when sampled.
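As a rough sketch of the first (eager) variant, under the assumption that all member outputs fit comfortably in memory; the class and attribute names are hypothetical, not the final botorch API:

```python
import torch
from torch import Tensor


class EagerEnsemblePosterior:
    """Sketch: stores the precomputed outputs of all ensemble members.

    `values` has shape `s x q x m`: s ensemble members, q points, m outcomes.
    """

    def __init__(self, values: Tensor) -> None:
        self.values = values

    def rsample(self, sample_shape: torch.Size = torch.Size()) -> Tensor:
        # Draw uniform random ensemble indices and gather the corresponding
        # precomputed outputs; no model re-evaluation happens here.
        n = sample_shape.numel()  # product of sample_shape (1 if empty)
        idx = torch.randint(self.values.shape[0], (n,))
        return self.values[idx].reshape(*sample_shape, *self.values.shape[1:])
```

The lazy variant would instead keep a reference to the model and evaluate only the indexed ensemble members inside `rsample`.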
I wonder if we should just be very clear here by implementing a dummy sampler that just evaluates `posterior.rsample()`.

For deterministic optimization we'd probably want to implement something like an `IndexSampler`.
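A minimal sketch of what such an index-based sampler might look like; this predates the `IndexSampler` that eventually landed, so the class name, the `posterior.ensemble_size` / `posterior.values` attributes, and all details are illustrative assumptions:

```python
from typing import Optional

import torch
from torch import Tensor


class IndexSamplerSketch:
    """Draws ensemble indices once, caches them, and reuses them across
    calls, keeping acquisition values deterministic in the inputs."""

    def __init__(self, sample_shape: torch.Size, seed: Optional[int] = None):
        self.sample_shape = sample_shape
        self.seed = seed
        self.base_samples: Optional[Tensor] = None  # cached index tensor

    def __call__(self, posterior) -> Tensor:
        if self.base_samples is None:
            gen = torch.Generator()
            if self.seed is not None:
                gen.manual_seed(self.seed)
            # The "base samples" are just indices into the ensemble dim.
            self.base_samples = torch.randint(
                high=posterior.ensemble_size,
                size=self.sample_shape,
                generator=gen,
            )
        # Assumes precomputed `values` with the ensemble dimension first,
        # as in the eager posterior sketch above.
        return posterior.values[self.base_samples]
```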
So one todo would definitely be to implement this `IndexSampler`. Concerning the two possible design patterns you outlined above, I am open to both, but the more I think about Wesley's implementation, the more I see its benefits. So I will change the PR accordingly if we decide to follow this design pattern. Fine for you?
I don't think it necessarily has to be either/or. One could have two different types of posterior, one that just contains a set of samples, and one that keeps the reference to the model and performs lazy evaluation. So we could start with whatever is easier / the more immediate need.
Ok, I will then just proceed with what I have currently and focus on implementing the sampler. (Code reference: botorch/acquisition/objective.py, line 280 in 3596b12.)

It could be that there will be more questions over the course of the week ;-)
Here comes already the first question: the docstring of the abstract `Posterior` class mentions base samples. Can the index samples drawn by the sampler play the role of these base samples, or should we implement `rsample` for the ensemble posterior without base samples? Best, Johannes
We can consider the index samples produced by the sampler as "base samples". Check out how the Normal samplers do this: the base samples are i.i.d. Normal, and the samplers call a `_construct_base_samples` method to create and cache them before evaluating the posterior.
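For comparison, a compressed paraphrase of that Normal-sampler pattern; this condenses botorch's actual implementation into a few lines, and the posterior interface used here is a stand-in:

```python
from typing import Optional

import torch
from torch import Tensor


class NormalSamplerSketch:
    """Paraphrase of the pattern: lazily construct i.i.d. Normal base
    samples, cache them on the sampler, and reuse them on every call."""

    def __init__(self, sample_shape: torch.Size) -> None:
        self.sample_shape = sample_shape
        self.base_samples: Optional[Tensor] = None

    def _construct_base_samples(self, event_shape: torch.Size) -> None:
        # Only (re)draw when no cached base samples of the required shape
        # exist; otherwise the cached tensor is reused across calls.
        target_shape = self.sample_shape + event_shape
        if self.base_samples is None or self.base_samples.shape != target_shape:
            self.base_samples = torch.randn(target_shape)

    def __call__(self, posterior) -> Tensor:
        self._construct_base_samples(posterior.event_shape)
        # The posterior maps the fixed base samples to posterior samples;
        # an index sampler would cache integer indices instead.
        return posterior.rsample_from_base_samples(
            self.sample_shape, self.base_samples
        )
```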
Sorry about the confusion around `StochasticSampler`. Regarding an `IndexSampler`: that would be the right tool once we want to support deterministic acquisition optimization. The motivation behind `StochasticSampler` is that it simply forwards to `posterior.rsample()`, which is why it should not be paired with the deterministic optimizers used by default in `optimize_acqf`.
Hi @saitcakmak and @Balandat, I just added the `IndexSampler`. Note, I kept the shape of `values` unchanged for now. Best, Johannes. P.S.: I know it is buggy, I just want to make sure that it goes in the correct direction ;)
Yup, this lgtm! I think one thing we could consider is to have the ensemble dimension be between the batch shape and the q dimension - that feels a bit more natural to me, as usually trailing dimensions are outcome dimensions throughout the code base. But really I'm not sure if it makes a difference.
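To make the shape suggestion concrete (all sizes illustrative): with the ensemble dimension `s` between the batch shape and the q dimension, the trailing dimensions stay `q x m`, matching the outcome-dimension convention.

```python
import torch

batch, s, q, m = 2, 16, 3, 1  # illustrative sizes

# Suggested layout: batch_shape x ensemble x q x m, so the trailing
# dimensions remain (q, m) as elsewhere in the code base.
values = torch.randn(batch, s, q, m)

# Sampling then indexes the ensemble dimension (dim=-3 here):
idx = torch.randint(s, (4,))
samples = values.index_select(dim=-3, index=idx)  # batch x 4 x q x m
assert samples.shape == (batch, 4, q, m)
```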
botorch/models/ensemble.py
```python
# Note: we use a Tensor instance check so that `observation_noise = True`
# just gets ignored. This avoids having to do a bunch of case distinctions
# when using a ModelList.
if isinstance(kwargs.get("observation_noise"), Tensor):
    # TODO: Consider returning an MVN here instead
    raise UnsupportedError(
        "Deterministic models do not support observation noise."
    )
```
One deeper question here is whether `EnsembleModel` should be limited to deterministic models or if (possibly down the road) we want it to be more general and also support other models (e.g. GPs). We currently have a `SaasFullyBayesianSingleTaskGP` that essentially is a batched model whose `posterior` call returns a `FullyBayesianPosterior`.

It seems that a natural hierarchy would be to have generic `Ensemble*` versions from which we derive both the deterministic and GP ensemble models.

Not suggesting this is something we should do as part of this PR, but something to think about. cc @saitcakmak, @dme65, @sdaulton
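As a purely illustrative sketch of the hierarchy being floated, explicitly not part of this PR; every class name other than `EnsembleModel` itself is hypothetical:

```python
from abc import ABC, abstractmethod

from torch import Tensor


class AbstractEnsembleModel(ABC):
    """Generic ensemble: anything producing per-member predictions."""

    @abstractmethod
    def forward(self, X: Tensor) -> Tensor:
        """Return per-member predictions for inputs X."""


class DeterministicEnsemble(AbstractEnsembleModel):
    """Ensemble of deterministic models - the case covered by this PR."""


class GPEnsemble(AbstractEnsembleModel):
    """Possible future case: e.g. a batched GP whose `posterior` call
    returns a fully Bayesian posterior over hyperparameter samples."""
```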
I do not feel competent enough to answer this question. I commented below regarding the similarity between the current version of `EnsembleModel` and `DeterministicModel`.
Technically, everything is an ensemble of a single model, unless it is an ensemble of multiple models. I don't see any immediate benefit from such a refactor, so I'd say let's revisit this if we end up with more ensemble models in the future.
I changed it according to your points. The one thing I could not change accordingly is explained above. I am looking forward to the comments of the other folks and hope to get it merged soon ;)
Thanks for implementing this. A bunch of nits in line, lgtm otherwise.
@saitcakmak: thanks for your review. I changed the PR according to your suggestions. I only left you a question above regarding one of them.
@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hmm, codecov shows missing coverage for a line that's just a comment (https://app.codecov.io/gh/pytorch/botorch/pull/1636#597cdf37e0736ca95369f0ff0f74eff4-R244). I'll re-run the tests to see if it disappears.
I think something is wrong with codecov on that one. It's a file not touched by this PR, the line is a comment, and this diff doesn't remove any tests. I wouldn't worry about it.
…ticSampler (#2391)

Summary:

Motivation: Delete references to the deprecated DeterministicPosterior and DeterministicSampler, replacing `DeterministicPosterior` with `EnsemblePosterior` where appropriate. These were deprecated before 0.9.0, so now that we are past 0.11.0, they can be reaped.

Pull Request resolved: #2391

Test Plan: Replaced `DeterministicPosterior` with `EnsemblePosterior` in tests.

Related PRs: #1636

Reviewed By: Balandat
Differential Revision: D59057165
Pulled By: esantorella
fbshipit-source-id: 70a8405d0e75d414a685192808e8c0b18a6aca92
Motivation

As discussed in #1064, this is an attempt to add an `EnsemblePosterior` to botorch that could be used, for example, by NN ensembles.

I have problems with implementing `rsample` properly. I think my current implementation is not correct: it is based on `DeterministicPosterior`, but I think we should sample solutions directly from the individual predictions of the ensemble. However, I do not know how to interpret `sample_shape` in this context.

As sampler, I registered the `StochasticSampler` for the new posterior class, but there too I am not sure if this is correct. Furthermore, I have another question regarding `StochasticSampler`: it is stated in the docstring of `StochasticSampler` that it should not be used in combination with `optimize_acqf`, yet `StochasticSampler` is assigned to the `DeterministicPosterior`. Does that mean one cannot use a `ModelList` consisting of a `DeterministicModel` and GPs in combination with `optimize_acqf`?

@Balandat: any suggestions on this?

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Unit tests. Not yet implemented/finished as it is still WIP.
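One possible reading of `sample_shape` here, consistent with the index-sampling discussion above (sizes illustrative): each element of the sample shape corresponds to one independent draw of an ensemble member, so the returned samples have shape `sample_shape x q x m`.

```python
import torch

s, q, m = 8, 3, 1              # ensemble size, q points, m outcomes
values = torch.randn(s, q, m)  # precomputed per-member predictions

sample_shape = torch.Size([4, 2])
idx = torch.randint(s, sample_shape)  # one member index per drawn sample
samples = values[idx]                 # shape: 4 x 2 x q x m
assert samples.shape == sample_shape + values.shape[1:]
```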