Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow optimization speed for high-dim problems with NChooseKs #450

Open
jduerholt opened this issue Oct 22, 2024 · 0 comments
Open

Slow optimization speed for high-dim problems with NChooseKs #450

jduerholt opened this issue Oct 22, 2024 · 0 comments

Comments

@jduerholt
Copy link
Contributor

Using NChooseK constraints in high-dim optimizations results in very slow candidate generation. We are treating NChooseK constraints as nonlinear constraints using the same continuous relaxations as in this paper: https://arxiv.org/abs/2203.01900.

def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:

When using nonlinear constraints in botorch, one has to provide starting points that fulfil the constraints. This is done by using BoFire's RandomStrategy:

The RandomStrategy is very inefficient when using NChooseK's in high dimensions, which can be easily shown by this script, which runs essentially forever:

import bofire.strategies.api as strategies
from bofire.data_models.api import Domain
from bofire.data_models.features.api import ContinuousInput
from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.strategies.api import RandomStrategy

N_INPUTS = 50

keys = [f"x{i}" for i in range(0,N_INPUTS)]

domain = Domain(
    inputs=[ContinuousInput(key=key, bounds=(0,1)) for key in keys],
    constraints=[NChooseKConstraint(features=keys, min_count=0, max_count=5, none_also_valid=True)]
)

strategy_data = RandomStrategy(domain=domain)
strategy = strategies.map(strategy_data)

strategy.ask(10)

When reducint N_INPUTS to smaller numbers like 20, it start to work again. The reason for the slow performance is that the random strategy generates all possible combinations for the NChooseK constraints:

_, unused = self.domain.get_nchoosek_combinations()

For this purpose, it uses domain.get_nchoosek_combinations which is super slow legacy code which was not written for high-dim problem settings. So to speed up the optimization, the generation of the valid starting points for the ACQF implementation has to be improved, or one has to swich to alternative solutions like SEBO, nevertheless a tidy-up of this part of the code is definitely necessary.

cc: @CompRhys

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant