Slow optimization speed for high-dim problems with NChooseKs #450

jduerholt · 2024-10-22T19:18:51Z

Using NChooseK constraints in high-dim optimizations results in very slow candidate generation. We are treating NChooseK constraints as nonlinear constraints using the same continuous relaxations as in this paper: https://arxiv.org/abs/2203.01900.

bofire/bofire/utils/torch_tools.py

Line 136 in 27e4693

    
           def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:

When using nonlinear constraints in botorch, one has to provide starting points that fulfil the constraints. This is done by using BoFire's RandomStrategy:

bofire/bofire/strategies/predictives/botorch.py

Line 248 in 27e4693

ic_gen_kwargs = {

The RandomStrategy is very inefficient when using NChooseK's in high dimensions, which can be easily shown by this script, which runs essentially forever:

import bofire.strategies.api as strategies
from bofire.data_models.api import Domain
from bofire.data_models.features.api import ContinuousInput
from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.strategies.api import RandomStrategy

N_INPUTS = 50

keys = [f"x{i}" for i in range(0,N_INPUTS)]

domain = Domain(
    inputs=[ContinuousInput(key=key, bounds=(0,1)) for key in keys],
    constraints=[NChooseKConstraint(features=keys, min_count=0, max_count=5, none_also_valid=True)]
)

strategy_data = RandomStrategy(domain=domain)
strategy = strategies.map(strategy_data)

strategy.ask(10)

When reducint N_INPUTS to smaller numbers like 20, it start to work again. The reason for the slow performance is that the random strategy generates all possible combinations for the NChooseK constraints:

bofire/bofire/strategies/random.py

Line 124 in 27e4693

_, unused = self.domain.get_nchoosek_combinations()

For this purpose, it uses domain.get_nchoosek_combinations which is super slow legacy code which was not written for high-dim problem settings. So to speed up the optimization, the generation of the valid starting points for the ACQF implementation has to be improved, or one has to swich to alternative solutions like SEBO, nevertheless a tidy-up of this part of the code is definitely necessary.

cc: @CompRhys

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow optimization speed for high-dim problems with NChooseKs #450

Slow optimization speed for high-dim problems with NChooseKs #450

jduerholt commented Oct 22, 2024

Slow optimization speed for high-dim problems with NChooseKs #450

Slow optimization speed for high-dim problems with NChooseKs #450

Comments

jduerholt commented Oct 22, 2024