
[QUESTION] Get the best predicted parameter (not observed) #1029

Closed · jultou-raa opened this issue Jul 21, 2022 · 9 comments
Labels: question (Further information is requested)

Comments

@jultou-raa

jultou-raa commented Jul 21, 2022

Hi guys!

I've been playing with Ax for a while, and I have never found a "native" way to get a predicted optimal set of parameters.

For example: if I take a dummy function like $x^2+1$ and want to minimize it, I expect the optimal parameter to be $x=0$.

[Figure: Optim_Search]

Using the AxClient API, I'm trying to recover the best parameter with ax_client.get_best_parameters(). But this returns the best observed data from the completed trials, so here I get the black point just to the left of 0...

Is it possible to get something that predicts the global optimum using the underlying model? I mean, a prediction of $x=0$ in my case?

If you want to play with this dataset, I've shared a snapshot here.

Thanks for your help!

@bernardbeckerman
Contributor

Hi @jultou-raa, thanks for posting here! Let me follow up with the team and get back to you.

@bernardbeckerman bernardbeckerman self-assigned this Jul 21, 2022
@bernardbeckerman bernardbeckerman added the question Further information is requested label Jul 21, 2022
@bernardbeckerman
Contributor

Hi @jultou-raa, would you be able to share the code you're using to run the AxClient API? There is a way to do this, but the best approach depends on your modeling setup.

@jultou-raa
Author

jultou-raa commented Jul 22, 2022

Hi @bernardbeckerman :)

First of all, thank you for the quick reply!

Sure, I can explain the process I followed for this example :)

  1. Initialize the AxClient using a custom generation strategy (2 SOBOL + GPEI):

    from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
    from ax.modelbridge.registry import Models
    from ax.service.ax_client import AxClient

    gs = GenerationStrategy(
        steps=[
            GenerationStep(
                model=Models.SOBOL,
                num_trials=2,
                min_trials_observed=2,
            ),
            GenerationStep(
                model=Models.GPEI,
                num_trials=-1,
                max_parallelism=nb_trials_batch,  # number of trials to run in parallel per batch
            ),
        ]
    )
    ax_client = AxClient(generation_strategy=gs)
  2. Create the experiment with this search space:

    objectives:
      - obj
    parameters:
      x:
        name: "x" 
        type: "range"
        bounds: [-1.0, 1.0]
        value_type: "float"
  3. Then I simulate a Human-in-the-Loop (a sketch of the full loop follows this list) by:

    1. Asking for the next trial using ax_client.get_next_trial()
    2. Computing the objective with an external spreadsheet
    3. Giving the result back to the ax_client using:

      ax_client.complete_trial(
          trial_index=trial,
          raw_data=data,
      )
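
Putting the three steps together, a minimal sketch of the loop looks roughly like this (gs is the GenerationStrategy from step 1; the experiment name and the run_spreadsheet helper are placeholders for my external computation, and I'm assuming the objectives-dict form of create_experiment):

    from ax.service.ax_client import AxClient
    from ax.service.utils.instantiation import ObjectiveProperties

    ax_client = AxClient(generation_strategy=gs)
    ax_client.create_experiment(
        name="dummy_quadratic",  # placeholder name
        parameters=[
            {
                "name": "x",
                "type": "range",
                "bounds": [-1.0, 1.0],
                "value_type": "float",
            },
        ],
        objectives={"obj": ObjectiveProperties(minimize=True)},
    )

    for _ in range(4):
        parameters, trial_index = ax_client.get_next_trial()
        # run_spreadsheet stands in for the external spreadsheet computation
        obj_value = run_spreadsheet(parameters["x"])
        ax_client.complete_trial(trial_index=trial_index, raw_data={"obj": obj_value})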

For my example, looping four times gave me:

Trial Index  obj          Arm Name  x                     Trial Status  Generation Method
0            1.007089604  0_0        0.08419978618621826  COMPLETED     Sobol
1            1.206375758  1_0       -0.45428598672151566  COMPLETED     Sobol
2            1.490542428  2_0        0.7003873413780071   COMPLETED     GPEI
3            1.006544824  3_0       -0.08090008723609299  COMPLETED     GPEI

Asking for ax_client.get_best_parameters() returns Trial #3 (obj=1.006544824, which is indeed the best observed), but I would expect something computed by the underlying Gaussian Process model.

Is this the kind of information you needed?

Thanks again for your help, and to the whole Facebook/Ax team! 👍

@jultou-raa
Author

Hi guys!
@bernardbeckerman, @lena-kashtelyan: do you have any news on this topic?

Maybe something is not clear in my previous comment? If so, please feel free to ask me again :)

Thanks for your help!

@sgbaird
Contributor

sgbaird commented Aug 2, 2022

@jultou-raa one option that comes to mind is swapping out the acquisition function with PosteriorMean (assuming I'm understanding that acquisition function correctly) and then calling get_next_trial(). I haven't tried it out, but it may be worth a shot. You might also consider the upper confidence bound (UCB) acquisition function (#955).

See also ModularBoTorchModel, #278, and #615. Maybe one of the Ax devs knows of a better way.

I'm also curious to know your use case for this. I've thought of doing something similar for fixed evaluation budgets, something to the effect of "every 10 iterations, try evaluating at the best-predicted location" to periodically demonstrate shorter-term gains to a stakeholder, especially one less familiar with the efficiency of Bayesian optimization; a rough sketch of that idea is below. Maybe a bad idea, but it's something that has come to mind.
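
Untested, but that "every Nth trial, exploit the model" idea might look roughly like this, reusing the ModularBoTorchModel route mentioned above (total_trials and evaluate are placeholders for your budget and your evaluation function):

    from ax.modelbridge.registry import Models
    from botorch.acquisition import PosteriorMean

    N = 10  # exploit the model on every N-th trial
    for i in range(total_trials):
        if i > 0 and i % N == 0:
            # Fit a model on all data so far and ask only for the posterior-mean optimizer.
            model = Models.BOTORCH_MODULAR(
                experiment=ax_client.experiment,
                data=ax_client.experiment.fetch_data(),
                botorch_acqf_class=PosteriorMean,
            )
            best_guess = model.gen(1).arms[0].parameters
            parameters, trial_index = ax_client.attach_trial(parameters=best_guess)
        else:
            parameters, trial_index = ax_client.get_next_trial()
        ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameters))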

@bernardbeckerman
Contributor

@sgbaird thanks for this response, and @jultou-raa sorry for the late reply! I agree with everything @sgbaird said, and also want to ask a bit more about your use case, particularly why you're looking for the modeled optimum rather than the optimum found so far. In the case that your goal is to do one final sample of the modeled optimum so as to get the best final result, I think this might not be the best strategy, since expected improvement is by definition the one-step optimal strategy for this purpose. Does that make sense? Also let me know if @sgbaird's solution works for you.

@jultou-raa
Author

jultou-raa commented Aug 10, 2022

Hi guys!

Thank you @sgbaird for this answer.

I tried both solutions (PosteriorMean and UCB) that you mentioned.

This is how I use them:

from ax.service.ax_client import AxClient
[...]
from ax.modelbridge.registry import Models
from botorch.acquisition import qUpperConfidenceBound, PosteriorMean
from ax.models.torch.botorch_modular.surrogate import Surrogate
from botorch.models.gp_regression import SingleTaskGP


ax_client = AxClient(generation_strategy=gs)  # See post https://github.com/facebook/Ax/issues/1029#issuecomment-1192290881 for gs variable details

[...]  # run the first four trials as described above

gp_posterior_mean = Models.BOTORCH_MODULAR(
    experiment=ax_client.experiment,
    data=ax_client.experiment.fetch_data(),
    surrogate=Surrogate(SingleTaskGP),
    botorch_acqf_class=PosteriorMean,  # here I tried either PosteriorMean or qUpperConfidenceBound
)

trial = gp_posterior_mean.gen(1)

trial.arms[0].parameters  # the candidate proposed by the model

Doing this, I get two different results (using one or the other) for my example above:

  • PosteriorMean gives me: 1.16e-03
  • qUpperConfidenceBound gives me: 2.20e-03

Now I have three questions 😄 (a small inspection sketch follows this list):

  • Am I doing the computation properly?
  • If I understand the BoTorch documentation correctly, can I only use PosteriorMean with a single outcome (q=1)?
  • Is it possible to get something more like a centered interval for the optimal parameter (using $3\sigma$ for example)? Something like $x \in [-2.2 \times 10^{-3}, 2.2 \times 10^{-3}]$?
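
Related to the first and third questions, here is a sketch of how I can at least pull the model's own prediction at the generated candidate (this gives a mean and a $3\sigma$ band on obj at that point, not an interval on $x$ itself; I'm assuming ModelBridge.predict takes a list of ObservationFeatures):

    import math

    from ax.core.observation import ObservationFeatures

    # Predicted mean and (co)variance of "obj" at the candidate generated above.
    candidate = trial.arms[0].parameters
    means, covariances = gp_posterior_mean.predict(
        [ObservationFeatures(parameters=candidate)]
    )
    mu = means["obj"][0]
    sigma = math.sqrt(covariances["obj"]["obj"][0])
    print(f"predicted obj at candidate: {mu:.4f} ± {3 * sigma:.4f} (3 sigma)")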

@sgbaird and @bernardbeckerman, to answer your question about why I need this: we run experiments that cost a lot, so the budget to reach the target is small. The idea is to finish an exploration/exploitation optimization (what GPEI or GPKG does) with a pure exploitation step. For example, if I get six shots to optimize my solution, it could be interesting for us to build the metamodel using 5 "smart" points and then spend the last one on pure exploitation of the previous knowledge.

In the example I gave in this thread, the last point Expected Improvement asked for was x = 0.0020908159615589117. So it is a little closer to zero than the qUpperConfidenceBound candidate, but not as close as the PosteriorMean one...

@dme65
Contributor

dme65 commented Sep 15, 2022

Hi @jultou-raa,

Regarding your three questions:

  1. Yes, what you are doing looks correct to me.
  2. PosteriorMean assumes one outcome, but q=1 here corresponds to how many candidates you want to evaluate. That is, you won't be able to use PosteriorMean if you want to generate more than 1 candidate and evaluate those in parallel. That doesn't seem to be something you are interested in though given your description above.
  3. I'm not sure if I understand what you mean here. Can you add some more details?

Taking a step back, every acquisition function you consider here generates a candidate close to zero and I wouldn't read too much into the fact that 1.16e−3 is slightly closer to zero than 2.20e−3. Given the situation you describe, EI is probably a natural choice here as it aims to maximize the expected improvement given one more function evaluation.

While it may feel like PosteriorMean is a natural choice when you have one evaluation left and want to focus on exploitation, here is a scenario where it will probably do the wrong thing: Assume your current best function value is f* and that the best posterior mean according to the model is also f*. Assume in addition that the uncertainty according to the model is 0 (the model is very confident in its prediction). Now, assume there is a second point with posterior mean f* + epsilon with epsilon>0 very close to zero, but that this point has very high uncertainty according to the model (the model is very unsure about its prediction). If you use PosteriorMean, it will ignore the model uncertainty and pick the point with posterior mean f*, which isn't a great choice since this point has no upside whatsoever. On the other hand, EI will end up picking the point with posterior mean f* + epsilon since this point has higher upside and may actually give you a sizable improvement compared to your current best point.
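
To put rough numbers on that scenario, here is a toy sketch (numbers made up, not from your model) using the analytic EI formula for minimization, $\mathrm{EI}(x) = \sigma(x)\,[z\,\Phi(z) + \varphi(z)]$ with $z = (f^* - \mu(x))/\sigma(x)$:

    from scipy.stats import norm

    def expected_improvement(mu, sigma, best):
        """Analytic EI for minimization: E[max(best - f, 0)] with f ~ N(mu, sigma^2)."""
        if sigma == 0.0:
            return max(best - mu, 0.0)
        z = (best - mu) / sigma
        return sigma * (z * norm.cdf(z) + norm.pdf(z))

    f_star = 1.0  # current best observed value

    # Point with posterior mean f* and zero uncertainty: PosteriorMean likes it, but EI = 0.
    print(expected_improvement(mu=f_star, sigma=0.0, best=f_star))         # 0.0
    # Point with slightly worse mean but high uncertainty: EI sees real upside.
    print(expected_improvement(mu=f_star + 0.01, sigma=0.5, best=f_star))  # ~0.19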

@lena-kashtelyan
Contributor

This seems answered and inactive, closing. Please feel free to reopen!
