Non-GP model types in BoTorch #1064
Yes, this is definitely possible (I have some research code doing exactly this with deep ensemble posteriors). In general, the mean and variance can be computed as the sample mean and variance of the networks' outputs, producing a normal approximation to the "posterior", while sampling can be done by selecting a random item (or items) from the list of networks. The only gotcha is dealing with multi-batched data, as some classes of NNs don't handle that well in torch (I'm thinking of things like RNNs and LSTMs on string inputs, for example).
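The approach described above can be sketched in a few lines of plain PyTorch. This is an illustrative sketch, not botorch API; the function names are made up for this example.

```python
import torch

def ensemble_mean_variance(networks, X):
    # Stack member predictions: shape (n_members, *X_batch_shape, n_outputs),
    # then reduce over the ensemble dimension to get a normal approximation.
    preds = torch.stack([net(X) for net in networks], dim=0)
    return preds.mean(dim=0), preds.var(dim=0)

def ensemble_sample(networks, X, n_samples):
    # Draw posterior samples by evaluating randomly chosen ensemble
    # members (with replacement).
    idx = torch.randint(len(networks), (n_samples,))
    return torch.stack([networks[i](X) for i in idx], dim=0)
```

For example, with an ensemble of three small networks, `ensemble_mean_variance(nets, X)` returns tensors of shape `(n_points, n_outputs)` and `ensemble_sample(nets, X, 4)` returns a `(4, n_points, n_outputs)` tensor of member outputs.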
Yeah, I did think about this when designing the APIs, so this should be possible without too much trouble (if not, we should fix that). Basically, as long as you can have a posterior object that implements …
Thanks for the info. I will try to set up a simple MWE for this in the next month. Maybe I will come back with some questions then ;)
Hi, I just tested using an NN ensemble within botorch and it works, both for analytic and MC acqfs. I just represented the posterior as a multivariate normal and used the … Does this approach make sense to you? Best, Johannes
Hmm, could you elaborate a bit more on what exactly you mean by …
Is this using Wesley's approach of taking the sample mean and variance of the NN outputs? If so, using this for sampling seems a bit odd, since you would be using samples from the "true posterior" of the network to fit an MVN and then sampling from that. Why not just use the NN outputs directly as "samples"? You could have a lightweight wrapper posterior object that just references the network internally, and where …
Yes, I calculate the mean and variance over the predictions of each NN in the ensemble. With the mean and variance alone, I can already use analytic acqfs like EI; of course, this assumes that the posterior is normally distributed. To also be able to use MC acqfs, I just used … But I think I will also implement your suggestion of sampling the outputs directly. For BNNs, one could then do the same.
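The MVN-based approach described in this comment could look roughly like the following. This is a minimal, hypothetical sketch assuming an independent Normal per output; it does not implement botorch's actual `Posterior` interface.

```python
import torch
from torch.distributions import Normal

class NaiveNormalEnsemblePosterior:
    """Fit an independent Normal to the ensemble's sample mean/variance
    and draw reparameterized samples from it (illustrative only; not
    botorch's actual Posterior API)."""

    def __init__(self, networks, X):
        # Summarize the ensemble by its sample mean and variance.
        preds = torch.stack([net(X) for net in networks], dim=0)
        self.mean = preds.mean(dim=0)
        self.variance = preds.var(dim=0)

    def rsample(self, sample_shape=torch.Size()):
        # Reparameterized sampling from the normal approximation, as
        # needed by MC acquisition functions.
        dist = Normal(self.mean, self.variance.clamp_min(1e-12).sqrt())
        return dist.rsample(sample_shape)
```

Note the trade-off discussed above: this samples from the fitted normal approximation rather than returning the raw member outputs as samples.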
In case you haven't already implemented it: I've managed to open-source a deep ensemble posterior class here that should be pretty generic and works with batching (the other code in the file is pretty tightly coupled to our research codebase for that paper). @Balandat I'm happy to try to write some variant of this up as a PR over the coming weeks if that'd be useful.
Thanks for sharing, looks promising. I will also try to add my implementation based on the multivariate normal at some point in the next few weeks, so that both options are available.
@wjmaddox that would be awesome! Do you have an end-to-end example using this that you can point to?
Yeah, here's roughly the link to the overall model class: https://github.com/samuelstanton/lambo/blob/7b67684b884f75f7007501978c5299514d0efb75/lambo/optimizers/pymoo.py#L343. As I think I mentioned previously, we were using genetic algorithms to optimize everything b/c the tasks we considered were discrete. @samuelstanton can walk you through more of the code if necessary. It's probably best for us to just pull a simple notebook out of our research code.
cc @sdaulton this could be a good real-world test case for some of your discrete optimization work.
@wjmaddox @samuelstanton: I had a closer look at your implementation of the ensemble posterior. I like it! I would be willing to create a PR based on it to bring it directly into botorch. I have one question; maybe @Balandat could also help there: from what I saw, the MC acqfs in recent botorch implementations always use … In @wjmaddox and @samuelstanton's implementation, the …
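One way the sampling question above could be resolved: instead of Gaussian base samples, an ensemble posterior can be sampled by drawing random member indices, with `sample_shape` controlling how many i.i.d. index draws are made. This is an illustrative sketch (the function name is made up), not the implementation under discussion.

```python
import torch

def rsample_by_member_index(ensemble_preds, sample_shape=torch.Size()):
    # ensemble_preds: (n_members, n_points, n_outputs), the stacked outputs
    # of each ensemble member. Each requested sample picks one member
    # uniformly at random; fixing these indices across acqf evaluations
    # would play the role that fixed base_samples play for GP posteriors.
    idx = torch.randint(ensemble_preds.shape[0], sample_shape)
    return ensemble_preds[idx]
```

Every returned sample is then exactly one of the member predictions, rather than a draw from a fitted normal approximation.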
Summary:

## Motivation

As discussed in #1064, this is an attempt to add an `EnsemblePosterior` to botorch that could be used, for example, by NN ensembles.

I have problems implementing `rsample` properly. I think my current implementation is not correct; it is based on `DeterministicPosterior`, but I think we should sample directly from the individual predictions of the ensemble. However, I do not know how to interpret `sample_shape` in this context. As sampler, I registered `StochasticSampler` for the new posterior class, but there too I am not sure if this is correct.

Furthermore, I have another question regarding `StochasticSampler`. Its docstring states that it should not be used in combination with `optimize_acqf`, yet `StochasticSampler` is assigned to `DeterministicPosterior`. Does this mean that one cannot use a `ModelList` consisting of a `DeterministicModel` and GPs in combination with `optimize_acqf`? @Balandat: any suggestions on this?

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes.

Pull Request resolved: #1636

Test Plan: Unit tests. Not yet implemented/finished as it is still WIP.

Reviewed By: saitcakmak

Differential Revision: D43017184

Pulled By: Balandat

fbshipit-source-id: fd2ede2dbba82a40c466f8a178138ced0fcba5fe
Resolved by #1636
Hi,
I was thinking about the possibility of using non-GP models within botorch, for example using a GP for one objective and a neural network (ensemble) for another. Using just a neural network should already be possible via the `GenericDeterministicModel` (botorch/models/deterministic.py, line 83 at f8da711). In this case, however, uncertainty estimates from an NN ensemble could not be used. My idea was to implement a new type of `Posterior` that also takes the variance from an NN ensemble and returns it as the `variance` of the posterior (botorch/posteriors/posterior.py, line 56 at f8da711).
This should already allow use of the whole botorch machinery of analytic acquisition functions. Of course, this assumes that the posterior is normally distributed. If one then also implements the `rsample` method of the posterior, one should also be able to use the MC acquisition functions. Do you see any obstacles in this?
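To make the "mean and variance are enough for analytic acqfs" point concrete: here is the standard closed-form Expected Improvement formula (for maximization) under a Normal posterior, written as a standalone sketch rather than botorch's own implementation.

```python
import torch

def expected_improvement(mean, var, best_f):
    # Closed-form EI under a Normal posterior N(mean, var):
    # EI(x) = sigma * (u * Phi(u) + phi(u)), with u = (mean - best_f) / sigma.
    # Only the posterior mean and variance are needed, which is why an
    # ensemble posterior exposing these suffices for analytic acqfs.
    sigma = var.clamp_min(1e-12).sqrt()
    u = (mean - best_f) / sigma
    normal = torch.distributions.Normal(0.0, 1.0)
    return sigma * (u * normal.cdf(u) + torch.exp(normal.log_prob(u)))
```

At the incumbent (`mean == best_f`, unit variance) this evaluates to the standard normal density at zero, about 0.399, and it grows with the posterior variance.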
I see the benefit in being able to use other model types in situations where they perform better than GPs, without having to reimplement the great machinery of acquisition functions and so forth that is already available in botorch.
Best,
Johannes