An attempt to fix the current CMAES inconsistencies #351

Merged
merged 21 commits into from
Jul 24, 2023
Changes from 2 commits
21 commits
670ebb2
attempt to fix current cmaes inconsistencies
SuvarshaChennareddy Jan 2, 2023
f28fd18
update tests
SuvarshaChennareddy Jan 2, 2023
39feebb
fix style, add deprecated constructor, and update tests
SuvarshaChennareddy Jan 16, 2023
4a42820
change test name
SuvarshaChennareddy Jan 16, 2023
21e2e89
remove unused code
SuvarshaChennareddy Jan 19, 2023
43f5146
update documentation, add ens_deprecated, and fix errors
SuvarshaChennareddy Apr 27, 2023
5a00748
update HISTORY.md
SuvarshaChennareddy Apr 27, 2023
ac10206
Merge branch 'master' into cmaes-fix
zoq May 24, 2023
d42f985
Tiny style fix for name of function: initialStepSize -> InitialStepSize.
rcurtin May 29, 2023
c88e0c3
Remove some more tab characters I found...
rcurtin May 29, 2023
46565e7
Compilation fixes for tests.
rcurtin May 29, 2023
716cea6
fix bug and modify tests
SuvarshaChennareddy May 31, 2023
e2e9065
add include gaurds for transformation policies
SuvarshaChennareddy Jun 9, 2023
26c1a23
update tests
SuvarshaChennareddy Jun 12, 2023
b49b03d
remove empty line
SuvarshaChennareddy Jun 12, 2023
1c4a4db
update tests
SuvarshaChennareddy Jun 13, 2023
5623d04
Update hyperparameters used in tests
SuvarshaChennareddy Jun 14, 2023
a6fb391
add missing comment
SuvarshaChennareddy Jul 10, 2023
b791390
Merge branch 'master' into cmaes-fix
SuvarshaChennareddy Jul 10, 2023
8a4e2e6
update population size used for CMAESLogisticRegressionFMatTest
SuvarshaChennareddy Jul 12, 2023
8ea9d06
add patience and update termination conditions
SuvarshaChennareddy Jul 12, 2023
57 changes: 29 additions & 28 deletions include/ensmallen_bits/cmaes/cmaes.hpp
@@ -17,6 +17,8 @@

#include "full_selection.hpp"
#include "random_selection.hpp"
#include "transformation_policies/empty_transformation.hpp"
#include "transformation_policies/boundary_box_constraint.hpp"

namespace ens {

@@ -46,8 +48,11 @@ namespace ens {
* ensmallen website.
*
* @tparam SelectionPolicy The selection strategy used for the evaluation step.
* @tparam TransformationPolicyType The transformation strategy used to
* map coordinates to the desired domain.
*/
template<typename SelectionPolicyType = FullSelection>
template<typename SelectionPolicyType = FullSelection,
typename TransformationPolicyType = EmptyTransformation<>>
class CMAES
{
public:
@@ -60,8 +65,8 @@ class CMAES
* equal one pass over the dataset).
*
* @param lambda The population size (0 use the default size).
* @param lowerBound Lower bound of decision variables.
* @param upperBound Upper bound of decision variables.
* @param transformationPolicy Instantiated transformation policy used to
* map the coordinates to the desired domain.
* @param batchSize Batch size to use for the objective calculation.
* @param maxIterations Maximum number of iterations allowed (0 means no
* limit).
@@ -70,8 +75,8 @@ class CMAES
* objective.
*/
CMAES(const size_t lambda = 0,
const double lowerBound = -10,
const double upperBound = 10,
const TransformationPolicyType&
transformationPolicy = TransformationPolicyType(),
const size_t batchSize = 32,
const size_t maxIterations = 1000,
const double tolerance = 1e-5,
@@ -87,31 +92,23 @@ class CMAES
* @tparam CallbackTypes Types of callback functions.
* @param function Function to optimize.
* @param iterate Starting point (will be modified).
* @param stepSize Starting sigma/step size (will be modified).
* @param callbacks Callback functions.
* @return Objective value of the final point.
*/
template<typename SeparableFunctionType,
typename MatType,
typename... CallbackTypes>
typename MatType::elem_type Optimize(SeparableFunctionType& function,
MatType& iterate,
CallbackTypes&&... callbacks);
typename MatType,
typename... CallbackTypes>
typename MatType::elem_type Optimize(SeparableFunctionType& function,
MatType& iterate,
double stepSize = 0.6,
CallbackTypes&&... callbacks);

//! Get the population size.
size_t PopulationSize() const { return lambda; }
//! Modify the population size.
size_t& PopulationSize() { return lambda; }

//! Get the lower bound of decision variables.
double LowerBound() const { return lowerBound; }
//! Modify the lower bound of decision variables.
double& LowerBound() { return lowerBound; }

//! Get the upper bound of decision variables
double UpperBound() const { return upperBound; }
//! Modify the upper bound of decision variables
double& UpperBound() { return upperBound; }

//! Get the batch size.
size_t BatchSize() const { return batchSize; }
//! Modify the batch size.
@@ -132,16 +129,17 @@ class CMAES
//! Modify the selection policy.
SelectionPolicyType& SelectionPolicy() { return selectionPolicy; }

//! Get the transformation policy.
const TransformationPolicyType& TransformationPolicy() const
{ return transformationPolicy; }
//! Modify the transformation policy.
TransformationPolicyType& TransformationPolicy()
{ return transformationPolicy; }

private:
//! Population size.
size_t lambda;

//! Lower bound of decision variables.
double lowerBound;

//! Upper bound of decision variables
double upperBound;

//! The batch size for processing.
size_t batchSize;

@@ -153,13 +151,16 @@ class CMAES

//! The selection policy used to calculate the objective.
SelectionPolicyType selectionPolicy;

//! The transformation policy used to map coordinates to the desired domain.
TransformationPolicyType transformationPolicy;
};

/**
* Convenient typedef for CMAES approximation.
*/
template<typename SelectionPolicyType = RandomSelection>
using ApproxCMAES = CMAES<SelectionPolicyType>;
template<typename TransformationPolicyType = EmptyTransformation<>,
typename SelectionPolicyType = RandomSelection>
using ApproxCMAES = CMAES<SelectionPolicyType, TransformationPolicyType>;

} // namespace ens
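
The header changes above replace the fixed `lowerBound`/`upperBound` pair with a policy class that the optimizer calls through `Transform()`. The following is a minimal, self-contained sketch of that pattern, not the ensmallen implementation: `IdentityTransformation`, `ClampTransformation`, `ToyOptimizer`, and `Objective` are hypothetical names, `std::vector<double>` stands in for the Armadillo matrix type, and a plain clamp is swapped in for brevity without any claim that it matches what `BoundaryBoxConstraint` does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not ensmallen
// classes, and std::vector<double> replaces the Armadillo matrix type.
using Vec = std::vector<double>;

// Identity policy: the optimizer searches the unconstrained space directly.
struct IdentityTransformation
{
  Vec Transform(const Vec& x) const { return x; }
};

// Clamp policy (an assumption for this sketch): maps every coordinate into
// the interval [lower, upper].
class ClampTransformation
{
 public:
  ClampTransformation(const double lower, const double upper) :
      lower(lower), upper(upper) { }

  Vec Transform(const Vec& x) const
  {
    Vec y(x);
    for (double& v : y)
      v = std::min(std::max(v, lower), upper);
    return y;
  }

 private:
  double lower;
  double upper;
};

// Toy objective: squared distance from the point (3, 3, ...).
inline double Objective(const Vec& x)
{
  double sum = 0.0;
  for (const double v : x)
    sum += (v - 3.0) * (v - 3.0);
  return sum;
}

// The optimizer only sees the policy through Transform(): candidates live in
// the unconstrained space and are mapped into the domain before evaluation.
template<typename TransformationPolicyType = IdentityTransformation>
class ToyOptimizer
{
 public:
  explicit ToyOptimizer(const TransformationPolicyType& policy =
      TransformationPolicyType()) : policy(policy) { }

  double EvaluateCandidate(const Vec& candidate) const
  {
    return Objective(policy.Transform(candidate));
  }

 private:
  TransformationPolicyType policy;
};
```

With these definitions, `ToyOptimizer<>` evaluates the candidate `{3.0, 3.0}` as is, while `ToyOptimizer<ClampTransformation>` constructed with `ClampTransformation{-1.0, 1.0}` evaluates the clamped point `{1.0, 1.0}` instead. Passing the policy as a template parameter keeps the update equations independent of how the domain is enforced.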

59 changes: 36 additions & 23 deletions include/ensmallen_bits/cmaes/cmaes_impl.hpp
@@ -22,31 +22,32 @@

namespace ens {

template<typename SelectionPolicyType>
CMAES<SelectionPolicyType>::CMAES(const size_t lambda,
const double lowerBound,
const double upperBound,
template<typename SelectionPolicyType, typename TransformationPolicyType>
CMAES<SelectionPolicyType, TransformationPolicyType>::CMAES(const size_t lambda,
const TransformationPolicyType&
transformationPolicy,
const size_t batchSize,
const size_t maxIterations,
const double tolerance,
const SelectionPolicyType& selectionPolicy) :
lambda(lambda),
lowerBound(lowerBound),
upperBound(upperBound),
batchSize(batchSize),
maxIterations(maxIterations),
tolerance(tolerance),
selectionPolicy(selectionPolicy)
selectionPolicy(selectionPolicy),
transformationPolicy(transformationPolicy)
{ /* Nothing to do. */ }

//! Optimize the function (minimize).
template<typename SelectionPolicyType>
template<typename SelectionPolicyType, typename TransformationPolicyType>
template<typename SeparableFunctionType,
typename MatType,
typename... CallbackTypes>
typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
typename MatType::elem_type CMAES<SelectionPolicyType,
TransformationPolicyType>::Optimize(
SeparableFunctionType& function,
MatType& iterateIn,
double stepSizeIn,
CallbackTypes&&... callbacks)
{
// Convenience typedefs.
@@ -78,7 +79,7 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(

// Step size control parameters.
BaseMatType sigma(2, 1); // sigma is vector-shaped.
sigma(0) = 0.3 * (upperBound - lowerBound);
sigma(0) = stepSizeIn; //0.3 * (upperBound - lowerBound);
const double cs = (muEffective + 2) / (iterate.n_elem + muEffective + 5);
const double ds = 1 + cs + 2 * std::max(std::sqrt((muEffective - 1) /
(iterate.n_elem + 1)) - 1, 0.0);
@@ -99,8 +100,7 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(

std::vector<BaseMatType> mPosition(2, BaseMatType(iterate.n_rows,
iterate.n_cols));
mPosition[0] = lowerBound + arma::randu<BaseMatType>(
iterate.n_rows, iterate.n_cols) * (upperBound - lowerBound);
mPosition[0] = iterate;

BaseMatType step(iterate.n_rows, iterate.n_cols);
step.zeros();
@@ -110,11 +110,13 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
for (size_t f = 0; f < numFunctions; f += batchSize)
{
const size_t effectiveBatchSize = std::min(batchSize, numFunctions - f);
const ElemType objective = function.Evaluate(mPosition[0], f,
const ElemType objective =
function.Evaluate(transformationPolicy.Transform(mPosition[0]), f,
effectiveBatchSize);
currentObjective += objective;

Callback::Evaluate(*this, function, mPosition[0], objective,
Callback::Evaluate(*this, function,
transformationPolicy.Transform(mPosition[0]), objective,
Member

Suggested change
transformationPolicy.Transform(mPosition[0]), objective,
transformationPolicy.Transform(mPosition[0]), objective,

Minor style fix: wrapped lines should be doubly indented.

Member

Also worth pointing out is that you call transformationPolicy.Transform() twice here for each inner iteration of the loop. Does it perhaps make more sense to call Transform() once, before the loop? That may require some further refactoring, but it seems to me to be conceptually simpler to simply use Transform() to map each new proposed iterate back into the required range right after a step is taken, only once.

Contributor Author

Ahh yes! Sorry and thank you for pointing this out! I will fix this very soon.

Contributor Author

> That may require some further refactoring, but it seems to me to be conceptually simpler to simply use Transform() to map each new proposed iterate back into the required range right after a step is taken, only once.

This would make the update steps depend on the transformation policy, which isn't really my intention here.
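
One way to reconcile the two points in this thread, at least for the initial-objective loop, is to note that `mPosition[0]` does not change inside that loop. Its transformed value can therefore be computed once and reused for every batch and callback without touching the update steps. The sketch below illustrates this under stated assumptions: `CountingClamp`, `EvaluateBatch`, and `InitialObjective` are hypothetical names, the clamp to [-1, 1] is only a placeholder policy, and `std::vector<double>` replaces the Armadillo matrix type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical policy that counts how often Transform() is invoked.
struct CountingClamp
{
  mutable std::size_t calls = 0;

  Vec Transform(const Vec& x) const
  {
    ++calls;
    Vec y(x);
    for (double& v : y)
      v = std::min(std::max(v, -1.0), 1.0);
    return y;
  }
};

// Stand-in for function.Evaluate(coordinates, begin, batchSize): the sum of
// squares of the coordinates, scaled by the number of points in the batch.
inline double EvaluateBatch(const Vec& x,
                            const std::size_t begin,
                            const std::size_t size)
{
  (void) begin; // Unused in this toy objective.
  double sum = 0.0;
  for (const double v : x)
    sum += v * v;
  return sum * static_cast<double>(size);
}

// Transform the mean once, then reuse the result for every batch.
inline double InitialObjective(const CountingClamp& policy,
                               const Vec& mean,
                               const std::size_t numFunctions,
                               const std::size_t batchSize)
{
  const Vec transformed = policy.Transform(mean); // Hoisted out of the loop.
  double objective = 0.0;
  for (std::size_t f = 0; f < numFunctions; f += batchSize)
  {
    const std::size_t effectiveBatchSize =
        std::min(batchSize, numFunctions - f);
    objective += EvaluateBatch(transformed, f, effectiveBatchSize);
    // A callback would receive `transformed` here as well.
  }
  return objective;
}
```

The mean itself stays in the unconstrained space, so the covariance and step-size updates remain independent of the transformation policy; only the evaluation sees the mapped point.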

callbacks...);
}

@@ -146,10 +148,12 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
// Controls early termination of the optimization process.
bool terminate = false;

BaseMatType transformedIterate = transformationPolicy.Transform(iterate);

// Now iterate!
terminate |= Callback::BeginOptimization(*this, function, iterate,
callbacks...);
for (size_t i = 1; i < maxIterations && !terminate; ++i)
terminate |= Callback::BeginOptimization(*this, function,
transformedIterate, callbacks...);
for (size_t i = 1; (i != maxIterations) && !terminate; ++i)
{
// To keep track of where we are.
const size_t idx0 = (i - 1) % 2;
@@ -161,6 +165,8 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
while (!arma::chol(covLower, C[idx0], "lower"))
C[idx0].diag() += std::numeric_limits<ElemType>::epsilon();

arma::eig_sym(eigval, eigvec, C[idx0]);

for (size_t j = 0; j < lambda; ++j)
{
if (iterate.n_rows > iterate.n_cols)
@@ -171,14 +177,14 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
else
{
pStep[idx(j)] = arma::randn<BaseMatType>(iterate.n_rows, iterate.n_cols)
* covLower;
* covLower.t();
}

pPosition[idx(j)] = mPosition[idx0] + sigma(idx0) * pStep[idx(j)];

// Calculate the objective function.
pObjective(idx(j)) = selectionPolicy.Select(function, batchSize,
pPosition[idx(j)], callbacks...);
transformationPolicy.Transform(pPosition[idx(j)]), callbacks...);
}

// Sort population.
@@ -192,27 +198,31 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(

// Calculate the objective function.
currentObjective = selectionPolicy.Select(function, batchSize,
mPosition[idx1], callbacks...);
transformationPolicy.Transform(mPosition[idx1]), callbacks...);

// Update best parameters.
if (currentObjective < overallObjective)
{
overallObjective = currentObjective;
iterate = mPosition[idx1];

terminate |= Callback::StepTaken(*this, function, iterate, callbacks...);
transformedIterate = transformationPolicy.Transform(iterate);
terminate |= Callback::StepTaken(*this, function,
transformedIterate, callbacks...);
}

// Update Step Size.
if (iterate.n_rows > iterate.n_cols)
{
ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt(
cs * (2 - cs) * muEffective) * covLower.t() * step;
Member

If the goal is to compute C^{-1/2}, it should suffice to use inv(covLower): https://math.stackexchange.com/questions/1230051/inverse-square-root-of-matrix

I think Armadillo will solve the system more quickly too if you specify trimatl() (since it is a lower triangular matrix): inv(trimatl(covLower)).

I haven't checked this exactly, but it should at least be in the right direction. You should verify my comment here instead of trusting it to be correct... 😄

Contributor Author

SuvarshaChennareddy, Jan 10, 2023

In Nikolaus Hansen's "The CMA Evolution Strategy: A Tutorial", C^{-1/2} is defined to be B * D^{-1} * B^T, which isn't always equal to the inverse of covLower = B * D^{1/2}.

Member

My concern here really comes from the fact that we are computing both the Cholesky decomposition and the eigendecomposition. I'd prefer to use one or the other, because these tend to be expensive operations. However, I haven't worked out the math to either (a) express C^{-1/2} in terms of the Cholesky decomposition or (b) express the earlier operations that use covLower in terms of the eigendecomposition.

Contributor Author

I understand and agree. I haven't had much time to think about this either but I'll try to figure something out soon.
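
For reference, the relationship between the two factorizations discussed in this thread can be written out explicitly. This is a sketch in the notation of Hansen's tutorial, where C = B D^2 B^T with B orthogonal and D diagonal; the lower-triangular Cholesky factor L and the orthogonal matrix Q are symbols introduced here only for illustration.

```latex
\begin{aligned}
C &= B D^{2} B^{\top}, \qquad
C^{1/2} = B D B^{\top}, \qquad
C^{-1/2} = B D^{-1} B^{\top}, \\
C &= L L^{\top}
\;\Longrightarrow\;
L = C^{1/2} Q \ \text{ with } Q = C^{-1/2} L \text{ orthogonal}, \qquad
L^{-1} = Q^{\top} C^{-1/2}.
\end{aligned}
```

So applying inv(trimatl(covLower)) to a vector gives the same norm as applying C^{-1/2}, but generally not the same direction, because Q is the identity only in special cases such as a diagonal C. Note also that D holds the square roots of the eigenvalues of C. If `eigval` contains the eigenvalues themselves, as returned by `arma::eig_sym(eigval, eigvec, C)`, then C^{-1/2} corresponds to `eigvec * diagmat(1 / sqrt(eigval)) * eigvec.t()`, whereas `diagmat(1 / eigval)` yields C^{-1}.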

cs * (2 - cs) * muEffective) *
eigvec * diagmat(1 / eigval) * eigvec.t() * step;
}
else
{
ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt(
cs * (2 - cs) * muEffective) * step * covLower.t();
cs * (2 - cs) * muEffective) * step *
eigvec * diagmat(1 / eigval) * eigvec.t();
}

const ElemType psNorm = arma::norm(ps[idx1]);
@@ -293,6 +303,7 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
Warn << "CMA-ES: converged to " << overallObjective << "; "
<< "terminating with failure. Try a smaller step size?" << std::endl;

iterate = transformationPolicy.Transform(iterate);
Callback::EndOptimization(*this, function, iterate, callbacks...);
return overallObjective;
}
@@ -302,13 +313,15 @@ typename MatType::elem_type CMAES<SelectionPolicyType>::Optimize(
Info << "CMA-ES: minimized within tolerance " << tolerance << "; "
<< "terminating optimization." << std::endl;

iterate = transformationPolicy.Transform(iterate);
Callback::EndOptimization(*this, function, iterate, callbacks...);
return overallObjective;
}

lastObjective = overallObjective;
}

iterate = transformationPolicy.Transform(iterate);
Callback::EndOptimization(*this, function, iterate, callbacks...);
return overallObjective;
}