Add surrogate models that can extrapolate #286

Closed
dlinzner-bcs opened this issue Sep 13, 2023 · 6 comments · Fixed by #298
Labels
enhancement New feature or request

Comments

@dlinzner-bcs
Contributor

I have repeatedly run into the problem that no in-spec points were found in the screening window, so we needed to extrapolate using a linear model. From there on I had to rely on workarounds. I think adding such models (e.g. BayesianRidge) would be really helpful.

As far as I understand, this feature is still missing? Another way to tackle this might be a "Bring Your Own Model" feature, which I am also not aware of.

@dlinzner-bcs dlinzner-bcs added the enhancement New feature or request label Sep 13, 2023
@jduerholt
Contributor

Hi Dominik,

Linear models are currently supported (https://github.com/experimental-design/bofire/blob/main/bofire/data_models/surrogates/linear.py). Such a model is just a GP with a linear kernel.
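To illustrate why a GP with a linear kernel helps with extrapolation, here is a minimal pure-Python sketch. It is not bofire code: the helper names (`solve`, `gp_mean`), the toy data, the fixed noise level and the fixed lengthscale are all assumptions made for the example. It compares the posterior mean of a zero-mean GP far outside the training range for a linear kernel and for an RBF kernel.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def linear_kernel(x1, x2):
    # Dot product plus a constant term, i.e. implicit features [1, x].
    return 1.0 + x1 * x2


def rbf_kernel(x1, x2, lengthscale=1.0):
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def gp_mean(kernel, xs, ys, x_new, noise=1e-4):
    """Posterior mean of a zero-mean GP at x_new (hyperparameters fixed)."""
    gram = [[kernel(a, b) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(gram, ys)
    return sum(kernel(x_new, x) * w for x, w in zip(xs, alpha))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]  # noiseless toy line y = 2x + 1

linear_pred = gp_mean(linear_kernel, xs, ys, x_new=10.0)
rbf_pred = gp_mean(rbf_kernel, xs, ys, x_new=10.0)
print(f"linear kernel at x=10: {linear_pred:.3f}")
print(f"RBF kernel at x=10:    {rbf_pred:.3e}")
```

In this toy setup the linear kernel continues the trend (close to 21 at x = 10), while the RBF prediction falls back to the prior mean of zero once the query point is several lengthscales away from the data.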

Note that I have also seen several times that the botorch priors tend to overfit (this was already noted by @DavidWalz or @bertiqwerty in MBO: https://github.com/basf/mbo/blob/d4061858c947af8deaa5ee8e4118615cb9328d02/mbo/torch_tools.py#L19). You can use the MBO priors for the GP in bofire as well by passing them to the kernel and the surrogate when building it: https://github.com/experimental-design/bofire/blob/main/bofire/data_models/priors/api.py

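from bofire.data_models.kernels.api import RBFKernel, ScaleKernel
from bofire.data_models.priors.api import (
    MBO_LENGTHCALE_PRIOR,
    MBO_NOISE_PRIOR,
    MBO_OUTPUTSCALE_PRIOR,
)
from bofire.data_models.surrogates.api import SingleTaskGPSurrogate

# GP with an ARD RBF kernel that uses the MBO priors instead of the defaults
surrogate_data = SingleTaskGPSurrogate(
    inputs=domain.inputs,
    outputs=outputs,
    kernel=ScaleKernel(
        base_kernel=RBFKernel(ard=True, lengthscale_prior=MBO_LENGTHCALE_PRIOR()),
        outputscale_prior=MBO_OUTPUTSCALE_PRIOR(),
    ),
    noise_prior=MBO_NOISE_PRIOR(),
)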
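To get a feel for how much two lengthscale priors can differ, the following standalone sketch compares two Gamma distributions. The (concentration, rate) pairs are assumptions chosen for illustration and should be checked against bofire/data_models/priors/api.py; the helper names are hypothetical as well. Only the standard formulas for the Gamma density and mean are used.

```python
import math


def gamma_pdf(x, concentration, rate):
    """Density of a Gamma(concentration, rate) distribution at x > 0."""
    log_pdf = (concentration * math.log(rate)
               - math.lgamma(concentration)
               + (concentration - 1.0) * math.log(x)
               - rate * x)
    return math.exp(log_pdf)


def gamma_mean(concentration, rate):
    return concentration / rate


# Assumed (concentration, rate) pairs for the lengthscale prior; verify them
# against bofire/data_models/priors/api.py before relying on them.
priors = {
    "botorch-style": (3.0, 6.0),
    "mbo-style": (2.0, 0.2),
}

for name, (concentration, rate) in priors.items():
    mean = gamma_mean(concentration, rate)
    density_at_5 = gamma_pdf(5.0, concentration, rate)
    print(f"{name}: mean={mean:.2f}, pdf(5.0)={density_at_5:.3g}")
```

Under these assumed values the first prior puts almost no mass on long lengthscales, which pushes the GP towards wiggly fits, whereas the second one is much broader and allows smoother functions.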

These priors often generalize better. To test this automatically, you can apply the hyperoptimize function to the GP surrogate data:

from bofire.benchmarks.api import hyperoptimize

opt_surrogate_data, purity_metrics = hyperoptimize(
    surrogate_data=SingleTaskGPSurrogate(inputs=domain.inputs, outputs=outputs),
    training_data=experiments,
    folds=5,
)

This tests certain combinations of priors and kernels and returns the best surrogate data found, together with a data frame containing the performance of each tested hyperparameter combination.
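The general idea behind such a search, scoring each candidate by k-fold cross-validation and keeping the best one, can be sketched without any library. This is not how bofire implements hyperoptimize; the candidate models, the fold assignment and all function names below are hypothetical and only illustrate the selection loop.

```python
def fit_constant(xs, ys):
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_linear(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def cross_validate(fit, xs, ys, folds):
    """Mean absolute error over all held-out points of a k-fold split."""
    errors = []
    for fold in range(folds):
        train = [i for i in range(len(xs)) if i % folds != fold]
        test = [i for i in range(len(xs)) if i % folds == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors += [abs(model(xs[i]) - ys[i]) for i in test]
    return sum(errors) / len(errors)


def select_best(candidates, xs, ys, folds=5):
    """Return the best candidate name and a table of (name, MAE) rows."""
    table = [(name, cross_validate(fit, xs, ys, folds))
             for name, fit in candidates.items()]
    best_name = min(table, key=lambda row: row[1])[0]
    return best_name, table


xs = [float(i) for i in range(10)]
ys = [3.0 * x + 2.0 for x in xs]  # toy data on an exact line
candidates = {"constant": fit_constant, "linear": fit_linear}
best_name, table = select_best(candidates, xs, ys)
print(best_name, table)
```

The returned table plays the role of the performance data frame mentioned above: one row per tested candidate, so the choice can be inspected rather than trusted blindly.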

The bring-your-own-model option is also available. You can code up your own models in botorch and pass them to bofire. You can find an example in this notebook, starting in cell 20: https://github.com/experimental-design/bofire/blob/main/tutorials/models_serial.ipynb.

If you have more questions/ideas etc. just let me know.

Best,

Johannes

@dlinzner-bcs
Contributor Author

dlinzner-bcs commented Sep 21, 2023

Thank you, Johannes! Can you please point me to an example of how to use a surrogate in combination with a strategy for optimization? I tried the following and got errors:

from bofire.data_models.domain.api import Outputs
from bofire.data_models.strategies.api import QparegoStrategy
from bofire.data_models.surrogates.api import BotorchSurrogates, LinearSurrogate

qparego_data_model = QparegoStrategy(
    domain=domain,
    surrogate_specs=BotorchSurrogates(
        surrogates=[
            LinearSurrogate(
                inputs=domain.inputs, outputs=Outputs(features=[domain.outputs[0]])
            ),
            LinearSurrogate(
                inputs=domain.inputs, outputs=Outputs(features=[domain.outputs[1]])
            ),
        ]
    ),
)

@jduerholt
Contributor

This is a bug; PR #290 should fix it. Can you please review it?

As a workaround, you can also just use a SingleTaskGPSurrogate with a linear kernel, which is equivalent:

strategy_data = QnehviStrategyDataModel(
    domain=benchmark.domain,
    surrogate_specs=BotorchSurrogates(
        surrogates=[
            SingleTaskGPSurrogate(
                inputs=benchmark.domain.inputs,
                outputs=Outputs(features=[benchmark.domain.outputs[0]]),
                kernel=ScaleKernel(base_kernel=RBFKernel(ard=False))
            ),
            SingleTaskGPSurrogate(
                inputs=benchmark.domain.inputs,
                outputs=Outputs(features=[benchmark.domain.outputs[1]]),
                kernel=LinearKernel()
            )
        ]
    )
)

@dlinzner-bcs
Contributor Author

Thank you, @jduerholt! The linear model now works for me. Is it also possible to implement a QuadraticKernel() using our current setup? I want to use a quadratic surrogate, i.e. assume y = W[a, b, ab, a^2, b^2].T. I initially thought of using a linear model with the corresponding constraints, but these would be nonlinear. Many thanks again!
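As a rough sketch of how such a quadratic model relates to a kernel: a degree-2 polynomial kernel (x·y + offset)^2 equals a dot product of quadratic feature vectors, so a GP with that kernel is a Bayesian linear model in the terms a, b, ab, a^2, b^2 plus an intercept. The function names, the offset of 1 and the sqrt(2) scalings below are choices made for this illustration, not an existing bofire API.

```python
import math


def quadratic_features(a, b):
    """Feature map [1, a, b, ab, a^2, b^2] with sqrt(2) scalings."""
    root2 = math.sqrt(2.0)
    return [1.0, root2 * a, root2 * b, root2 * a * b, a * a, b * b]


def polynomial_kernel(x, y, offset=1.0, power=2):
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + offset) ** power


x, y = (0.3, -1.2), (2.0, 1.5)
# Explicit route: dot product in the six-dimensional feature space.
explicit = sum(p * q for p, q in zip(quadratic_features(*x), quadratic_features(*y)))
# Implicit route: evaluate the kernel directly on the two inputs.
implicit = polynomial_kernel(x, y)
print(explicit, implicit)
```

Both routes agree up to rounding, which is why the quadratic surrogate can be expressed through a kernel instead of through explicit nonlinear feature columns.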

@dlinzner-bcs
Contributor Author

This looks like what I want. Do you think it makes sense to implement it? (I can probably do it)

@jduerholt
Contributor

It definitely makes sense to implement it. It has been on my list for a long time and should be quite easy to do. Just have a look at how, for example, the linear kernel is implemented and do it the same way: https://github.com/experimental-design/bofire/blob/main/bofire/kernels/mapper.py
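The pattern of "one data model per kernel plus one mapping function" can be sketched generically as a type-to-builder table. The classes and functions below are hypothetical stand-ins written for this illustration; they are not the contents of mapper.py, and the real implementation should be taken from the linked file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearKernelSpec:
    """Hypothetical serializable description of a linear kernel."""


@dataclass(frozen=True)
class PolynomialKernelSpec:
    """Hypothetical serializable description of a polynomial kernel."""
    power: int = 2
    offset: float = 1.0


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def map_linear(spec):
    return lambda x, y: dot(x, y)


def map_polynomial(spec):
    return lambda x, y: (dot(x, y) + spec.offset) ** spec.power


# One entry per kernel type; supporting a new kernel means adding a spec
# class, a builder function, and a line in this table.
KERNEL_BUILDERS = {
    LinearKernelSpec: map_linear,
    PolynomialKernelSpec: map_polynomial,
}


def build_kernel(spec):
    """Look up the builder for the spec's type and return the kernel."""
    return KERNEL_BUILDERS[type(spec)](spec)


kernel = build_kernel(PolynomialKernelSpec(power=2, offset=1.0))
value = kernel((1.0, 2.0), (3.0, 4.0))
print(value)
```

With this layout the strategy code never needs to know which kernels exist; it only calls the lookup function with whatever spec the user configured.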

@dlinzner-bcs dlinzner-bcs linked a pull request Oct 5, 2023 that will close this issue