Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CCSA/MMA improvements: docs, rho_init, initial_step #561

Merged
merged 5 commits into from
Aug 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 5 additions & 4 deletions doc/docs/NLopt_Algorithms.md
Original file line number Diff line number Diff line change
Expand Up @@ -305,18 +305,21 @@ My implementation of the globally-convergent method-of-moving-asymptotes (MMA) a

This is an improved CCSA ("conservative convex separable approximation") variant of the original MMA algorithm published by Svanberg in 1987, which has become popular for topology optimization. (*Note:* "globally convergent" does *not* mean that this algorithm converges to the global optimum; it means that it is guaranteed to converge to *some* local minimum from any feasible starting point.)

At each point **x**, MMA forms a local approximation using the gradient of *f* and the constraint functions, plus a quadratic "penalty" term to make the approximations "conservative" (upper bounds for the exact functions). The precise approximation MMA forms is difficult to describe in a few words, because it includes nonlinear terms consisting of a poles at some distance from *x* (outside of the current trust region), almost a kind of Padé approximant. The main point is that the approximation is both convex and separable, making it trivial to solve the approximate optimization by a dual method. Optimizing the approximation leads to a new candidate point **x**. The objective and constraints are evaluated at the candidate point. If the approximations were indeed conservative (upper bounds for the actual functions at the candidate point), then the process is restarted at the new **x**. Otherwise, the approximations are made more conservative (by increasing the penalty term) and re-optimized.
At each point **x**, CCSA forms a local approximation using the gradient of *f* and the constraint functions, plus a quadratic "penalty" term to make the approximations "conservative" (upper bounds for the exact functions). The precise approximation MMA forms is difficult to describe in a few words, because it includes nonlinear terms consisting of a poles at some distance from *x* (outside of the current trust region), almost a kind of Padé approximant. The main point is that the approximation is both convex and separable, making it trivial to solve the approximate optimization by a dual method. Optimizing the approximation leads to a new candidate point **x**. The objective and constraints are evaluated at the candidate point. If the approximations were indeed conservative (upper bounds for the actual functions at the candidate point), then the process is restarted at the new **x**. Otherwise, the approximations are made more conservative (by increasing the penalty term) and re-optimized.

(If you contact [Professor Svanberg](https://people.kth.se/~krille/), he has been willing in the past to graciously provide you with his original code, albeit under restrictions on commercial use or redistribution. The MMA implementation in NLopt, however, is completely independent of Svanberg's, whose code we have not examined; any bugs are my own, of course.)

I also implemented another CCSA algorithm from the same paper, `NLOPT_LD_CCSAQ`: instead of constructing local MMA approximations, it constructs simple quadratic approximations (or rather, affine approximations plus a quadratic penalty term to stay conservative). This is the ccsa_quadratic code. It seems to have similar convergence rates to MMA for most problems, which is not surprising as they are both essentially similar. However, for the quadratic variant I implemented the possibility of [preconditioning](NLopt_Reference.md#preconditioning-with-approximate-hessians): including a user-supplied Hessian approximation in the local model. It is easy to incorporate this into the proof in Svanberg's paper, and to show that global convergence is still guaranteed as long as the user's "Hessian" is positive semidefinite, and in practice it can greatly improve convergence if the preconditioner is a good approximation for the real Hessian (at least for the eigenvectors of the largest eigenvalues).
I also implemented another CCSA algorithm from the same paper, `NLOPT_LD_CCSAQ`: instead of constructing local MMA approximations, it constructs simple quadratic approximations (or rather, affine approximations plus a quadratic penalty term to stay conservative). It seems to have similar convergence rates to MMA for most problems, which is not surprising as they are both essentially similar. My CCSAQ implementation seems to behave better in some ways that the MMA variant, in fact, so I generally recommend trying `NLOPT_LD_CCSAQ` first rather than `NLOPT_LD_MMA`.

For the quadratic variant I also implemented the possibility of [preconditioning](NLopt_Reference.md#preconditioning-with-approximate-hessians): including a user-supplied Hessian approximation in the local model. It is easy to incorporate this into the proof in Svanberg's paper, and to show that global convergence is still guaranteed as long as the user's "Hessian" is positive semidefinite, and in practice it can greatly improve convergence if the preconditioner is a good approximation for the real Hessian (at least for the eigenvectors of the largest eigenvalues).

The `NLOPT_LD_MMA` and `NLOPT_LD_CCSAQ` algorithms support the following internal parameters, which can be
specified using the [`nlopt_set_param` API](NLopt_Reference#algorithm-specific-parameters):

* `inner_maxeval`: If ≥ 0, gives maximum number of "inner" iterations of the algorithm where it tries to ensure that its approximatations are "conservative"; defaults to `0` (no limit). It can be useful to specify a finite number (e.g. `5` or `10`) for this parameter if inaccuracies in your gradient or objective function are preventing the algorithm from making progress.
* `dual_algorithm` (defaults to `NLOPT_LD_MMA`), `dual_ftol_rel` (defaults to `1e-14`), `dual_ftol_abs` (defaults to `0`), `dual_xtol_rel` (defaults to `0`), `dual_xtol_abs` (defaults to `0`), `dual_maxeval` (defaults to `100000`): These specify how the algorithm internally solves the "dual" optimization problem for its approximate objective. Because this subsidiary solve requires no evaluations of the user's objective function, it is typically fast enough that we can solve it to high precision without worrying too much about the details. Howeve,r in high-dimensional problems you may notice that MMA/CCSA is taking a long time between optimization steps, in which case you may want to increase `dual_ftol_rel` or make other changes. If these parameters are not specified, NLopt takes them from the [subsidiary-optimizer algorithm](NLopt_Reference#localsubsidiary-optimization-algorithm) if that has been specified, and otherwise uses the defaults indicated here.
* `verbosity`: If > 0, causes the algorithm to print internal status information on each iteration.
* `rho_init`: if specified, should be a rough upper bound for the second derivative (the biggest eigenvalue of the Hessian of the objective or constraints); defaults to `1.0`. CCSA/MMA will adaptively adjust this as the optimization progresses, so even it if `rho_init` is completely wrong the algorithm will still converge. A `rho_init` that is too large will cause the algorithm to take overly small steps at the beginning, while a `rho_init` that is too small will cause it to take overly large steps (and have to backtrack) at the beginning. Similarly, you can also use the "initial stepsize" option ([NLopt reference](NLopt_Reference.md#initial-step-size)) to control the maximum size of the initial steps (half the diameter of the trust region).

### SLSQP

Expand Down Expand Up @@ -396,5 +399,3 @@ The subsidiary optimization algorithm is specified by the `nlopt_set_local_optim
The augmented Lagrangian method is specified in NLopt as `NLOPT_AUGLAG`. We also provide a variant, `NLOPT_AUGLAG_EQ`, that only uses penalty functions for equality constraints, while inequality constraints are passed through to the subsidiary algorithm to be handled directly; in this case, the subsidiary algorithm must handle inequality constraints (e.g. MMA or COBYLA).

While NLopt uses an independent re-implementation of the Birgin and Martínez algorithm, those authors provide their own free-software implementation of the method as part of the [TANGO](http://www.ime.usp.br/~egbirgin/tango/) project, and implementations can also be found in [semi-free](http://www.gnu.org/philosophy/categories.html#semi-freeSoftware) packages like [LANCELOT](http://www.numerical.rl.ac.uk/lancelot/blurb.html).


11 changes: 7 additions & 4 deletions src/algs/mma/ccsa_quadratic.c
Original file line number Diff line number Diff line change
Expand Up @@ -218,7 +218,8 @@ nlopt_result ccsa_quadratic_minimize(
double *x, /* in: initial guess, out: minimizer */
double *minf,
nlopt_stopping *stop,
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose)
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose, double rho_init,
const double *sigma_init)
{
nlopt_result ret = NLOPT_SUCCESS;
double *xcur, rho, *sigma, *dfdx, *dfdx_cur, *xprev, *xprevprev, fcur;
Expand Down Expand Up @@ -322,14 +323,16 @@ nlopt_result ccsa_quadratic_minimize(
}

for (j = 0; j < n; ++j) {
if (nlopt_isinf(ub[j]) || nlopt_isinf(lb[j]))
if (sigma_init && sigma_init[j] > 0)
sigma[j] = sigma_init[j];
else if (nlopt_isinf(ub[j]) || nlopt_isinf(lb[j]))
sigma[j] = 1.0; /* arbitrary default */
else
sigma[j] = 0.5 * (ub[j] - lb[j]);
}
rho = 1.0;
rho = rho_init;
for (i = 0; i < m; ++i) {
rhoc[i] = 1.0;
rhoc[i] = rho_init;
dual_lb[i] = y[i] = 0.0;
dual_ub[i] = HUGE_VAL;
}
Expand Down
11 changes: 7 additions & 4 deletions src/algs/mma/mma.c
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,8 @@ nlopt_result mma_minimize(unsigned n, nlopt_func f, void *f_data,
double *x, /* in: initial guess, out: minimizer */
double *minf,
nlopt_stopping *stop,
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose)
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose, double rho_init,
const double *sigma_init)
{
nlopt_result ret = NLOPT_SUCCESS;
double *xcur, rho, *sigma, *dfdx, *dfdx_cur, *xprev, *xprevprev, fcur;
Expand Down Expand Up @@ -198,14 +199,16 @@ nlopt_result mma_minimize(unsigned n, nlopt_func f, void *f_data,
dd.gcval = gcval;

for (j = 0; j < n; ++j) {
if (nlopt_isinf(ub[j]) || nlopt_isinf(lb[j]))
if (sigma_init && sigma_init[j] > 0)
sigma[j] = sigma_init[j];
else if (nlopt_isinf(ub[j]) || nlopt_isinf(lb[j]))
sigma[j] = 1.0; /* arbitrary default */
else
sigma[j] = 0.5 * (ub[j] - lb[j]);
}
rho = 1.0;
rho = rho_init;
for (i = 0; i < m; ++i) {
rhoc[i] = 1.0;
rhoc[i] = rho_init;
dual_lb[i] = y[i] = 0.0;
dual_ub[i] = HUGE_VAL;
}
Expand Down
7 changes: 4 additions & 3 deletions src/algs/mma/mma.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,8 @@ nlopt_result mma_minimize(unsigned n, nlopt_func f, void *f_data,
double *x, /* in: initial guess, out: minimizer */
double *minf,
nlopt_stopping *stop,
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose);
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose, double rho_init,
const double *sigma_init);

nlopt_result ccsa_quadratic_minimize(
unsigned n, nlopt_func f, void *f_data,
Expand All @@ -52,11 +53,11 @@ nlopt_result ccsa_quadratic_minimize(
double *x, /* in: initial guess, out: minimizer */
double *minf,
nlopt_stopping *stop,
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose);
nlopt_opt dual_opt, int inner_maxeval, unsigned verbose, double rho_init,
const double *sigma_init);

#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */

#endif

13 changes: 10 additions & 3 deletions src/api/optimize.c
Original file line number Diff line number Diff line change
Expand Up @@ -656,8 +656,16 @@ static nlopt_result nlopt_optimize_(nlopt_opt opt, double *x, double *minf)
case NLOPT_LD_MMA:
case NLOPT_LD_CCSAQ:
{
int inner_maxeval = (int)nlopt_get_param(opt, "inner_maxeval",0);
int verbosity = (int)nlopt_get_param(opt, "verbosity",0);
double rho_init = nlopt_get_param(opt, "rho_init",1.0);
nlopt_opt dual_opt;
nlopt_result ret;

if (!(rho_init > 0) && !nlopt_isinf(rho_init))
RETURN_ERR(NLOPT_INVALID_ARGS, opt, "rho_init must be positive and finite");
verbosity = verbosity < 0 ? 0 : verbosity;

#define LO(param, def) (opt->local_opt ? opt->local_opt->param : (def))
dual_opt = nlopt_create((nlopt_algorithm)nlopt_get_param(opt, "dual_algorithm", LO(algorithm, nlopt_local_search_alg_deriv)),
nlopt_count_constraints(opt->m, opt->fc));
Expand All @@ -669,11 +677,10 @@ static nlopt_result nlopt_optimize_(nlopt_opt opt, double *x, double *minf)
nlopt_set_xtol_abs1(dual_opt, nlopt_get_param(opt, "dual_xtol_abs", 0.0));
nlopt_set_maxeval(dual_opt, nlopt_get_param(opt, "dual_maxeval", LO(maxeval, 100000)));
#undef LO

if (algorithm == NLOPT_LD_MMA)
ret = mma_minimize(n, f, f_data, opt->m, opt->fc, lb, ub, x, minf, &stop, dual_opt, (int)nlopt_get_param(opt, "inner_maxeval",0), (unsigned)nlopt_get_param(opt, "verbosity",0));
ret = mma_minimize(n, f, f_data, opt->m, opt->fc, lb, ub, x, minf, &stop, dual_opt, inner_maxeval, (unsigned)verbosity, rho_init, opt->dx);
else
ret = ccsa_quadratic_minimize(n, f, f_data, opt->m, opt->fc, opt->pre, lb, ub, x, minf, &stop, dual_opt, (int)nlopt_get_param(opt, "inner_maxeval",0), (unsigned)nlopt_get_param(opt, "verbosity",0));
ret = ccsa_quadratic_minimize(n, f, f_data, opt->m, opt->fc, opt->pre, lb, ub, x, minf, &stop, dual_opt, inner_maxeval, (unsigned)verbosity, rho_init, opt->dx);
nlopt_destroy(dual_opt);
return ret;
}
Expand Down
4 changes: 4 additions & 0 deletions test/t_tutorial.cxx
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,10 @@ int main() {
return EXIT_FAILURE;
}

// set a couple of other parameters
opt.set_param("rho_init", 0.5);
opt.set_initial_step(0.1);

std::vector<double> x(2);
x[0] = 1.234; x[1] = 5.678;
double minf;
Expand Down
9 changes: 8 additions & 1 deletion test/testopt.c
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ extern "C" int feenableexcept(int EXCEPTS);

static nlopt_algorithm algorithm = NLOPT_GN_DIRECT_L;
static double ftol_rel = 0, ftol_abs = 0, xtol_rel = 0, xtol_abs = 0, minf_max_delta;
static double initial_step = 0;
static int maxeval = 1000, iterations = 1, center_start = 0;
static double maxtime = 0.0;
static double xinit_tol = -1;
Expand Down Expand Up @@ -235,6 +236,8 @@ static int test_function(int ifunc)
nlopt_set_xtol_abs(opt, xtabs);
nlopt_set_maxeval(opt, maxeval);
nlopt_set_maxtime(opt, maxtime);
if (initial_step != 0)
nlopt_set_initial_step1(opt, initial_step);
ret = nlopt_optimize(opt, x, &minf);
printf("finished after %g seconds.\n", nlopt_seconds() - start);
printf("return code %d from nlopt_minimize\n", ret);
Expand Down Expand Up @@ -290,6 +293,7 @@ static void usage(FILE * f)
" -a <n> : use optimization algorithm <n>\n"
" -o <n> : use objective function <n>\n"
" -0 <x> : starting guess within <x> + (1+<x>) * optimum\n"
" -S <dx>: initial step size dx (default: none)\n"
" -b <dim0,dim1,...>: eliminate given dims by equating bounds\n");
fprintf(f,
" -c : starting guess at center of cell\n"
Expand Down Expand Up @@ -321,7 +325,7 @@ int main(int argc, char **argv)
feenableexcept(FE_INVALID);
#endif

while ((c = getopt(argc, argv, "hLvVCc0:r:a:o:i:e:t:x:X:f:F:m:b:")) != -1)
while ((c = getopt(argc, argv, "hLvVCc0:r:a:o:i:e:t:x:X:f:F:m:b:S:")) != -1)
switch (c) {
case 'h':
usage(stdout);
Expand Down Expand Up @@ -389,6 +393,9 @@ int main(int argc, char **argv)
center_start = 0;
xinit_tol = atof(optarg);
break;
case 'S':
initial_step = atof(optarg);
break;
case 'b':{
const char *s = optarg;
while (s && *s) {
Expand Down
Loading