
Add Dirichlet Weight Initialization #296

Merged
merged 6 commits into mlpack:master on Jun 23, 2021

Conversation

jonpsy
Member

@jonpsy jonpsy commented Jun 18, 2021

Continuing #293

  • Added dirichlet_init.hpp
  • Test and verify locally.
  • using DirichletMOEAD = MOEAD<Dirichlet, Tchebycheff>
  • Document in dirichlet_init.hpp
  • Add documentation in optimizers.md
  • History.md

@jonpsy jonpsy changed the title from "Add Dirichlet Weight Initializatoin" to "Add Dirichlet Weight Initialization" Jun 18, 2021
Comment on lines 17 to 22
/**
* The Dirichlet method for initializing weights. Sampling a Dirichlet distribution
* with all parameters set to one returns points lying on the unit simplex, with
* uniform distribution.
*
*/
Member Author

@zoq Not sure what the source should be?

Member

I don't think there is one we should reference here.

@zoq zoq left a comment (Member)

Can you add a test for the policy as well?

@jonpsy
Member Author

jonpsy commented Jun 18, 2021

You have any specifics in mind?

@zoq
Member

zoq commented Jun 18, 2021

We can just use the existing one but change the policy.

@jonpsy
Member Author

jonpsy commented Jun 18, 2021

I could test it against a difficult ZDT problem and see how it fares. The paper has tried testing against some ZDT problems, so we can see how closely the theoretical results match.

@zoq
Member

zoq commented Jun 18, 2021

Right, let's test it on e.g. one ZDT problem.

const size_t numPoints,
const double epsilon)
{
MatType weights = arma::randg<MatType>(numObjectives, numPoints,
Contributor

Generation functions usually allow for the underlying distribution shape to change, e.g.:

random.my_dist(n, shape1, shape2)

So, this is going to generate a Dirichlet under alpha = 1 / shape = 1. Or:

random.my_dist(n, alpha = 1)

Is that the desired outcome?

Member Author

alpha = 1. See this.

Contributor

Okay, so this weight policy is being designed with a specific application in mind so that alpha doesn't need to be flexible.

Member Author

Yes, since we want a uniform distribution. The alpha vector is set to the ones vector.

MatType weights = arma::randg<MatType>(numObjectives, numPoints,
arma::distr_param(1.0, 1.0)) + epsilon;
// Normalize each column.
return arma::normalise(weights, 1, 0);
Contributor

So, thinking about this... We have a gamma distribution given as:

Y_i ~ gamma(alpha_i, 1)

From there, we're creating the normalization of:

X = sum_{i = 1}^{n} Y_i
V = (Y_1/X, Y_2/X, ... Y_n/X) ~ Dirichlet(alpha_1, ... alpha_n)

https://en.wikipedia.org/wiki/Dirichlet_distribution#Gamma_distribution

Looking at:

http://arma.sourceforge.net/docs.html#normalise

normalise gives us a unit p-norm. With p = 1, we're getting:

X = sum abs(Y_i)

The abs() goes against the summation component, but gamma RVs will always be > 0.

So, long story short... We're doing one extra operation, but the end outcome will be the same.

Member Author

I think p=1 gives something like

std::sqrt( elem0^2 + elem1^2 + elem2^2 ..... elemn^2 )

and not the sum, if I've got this correctly. I suppose you're saying that, assuming X = sum abs(Y_i), the abs part is unnecessary?

Contributor

Nope, that would be p=2/ l2-norm. This is using an l1-norm by column.

p=1 gives:

std::pow( abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 ..... abs(elemn)^1, 1.0);

or:

(abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 + ..... + abs(elemn)^1)^1
= abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 + ..... + abs(elemn)^1
= abs(elem0) + abs(elem1) + abs(elem2) + ..... + abs(elemn)

I think by-column is correct instead of by-row due to the weights being structured numObjectives x numPoints (norm across objectives under a single point)

Member Author

Yes, I've done it on a column basis so that the normalization of each point is ensured. I just revised the L-norms; I was mistaken about the L1 norm formula, thanks for pointing it out.

I think the "abs" part shouldn't really be much of a concern; either way, we know each element is > 0. We have one line of code which does the job for us, and I really don't think the additional "abs" would severely hinder performance.

@jonpsy
Member Author

jonpsy commented Jun 18, 2021

Planning to add
using DirichletMOEAD = MOEAD<Dirichlet, Tchebycheff>

Add DirichletMOEAD
@jonpsy
Member Author

jonpsy commented Jun 18, 2021

Right, let's test it on e.g. one ZDT problem.

I tested it against the ZDT3 problem, running it over 1000 times with random seeds using this bash command. It passed all the tests.

#!/bin/bash
for (( i = 0; i < 1000; i++))
do
 ./ensmallen_tests MOEADDIRICHLETZDT3Test --rng-seed time
done

@jonpsy
Member Author

jonpsy commented Jun 21, 2021

I suppose it's fair to say this PR is GTM?

@zoq zoq left a comment (Member)

Looks great to me.

@mlpack-bot mlpack-bot bot left a comment

Second approval provided automatically after 24 hours. 👍

@zoq
Member

zoq commented Jun 23, 2021

Let's see if the build comes back green and merge it afterwards.

@zoq zoq merged commit 9232017 into mlpack:master Jun 23, 2021
@jonpsy jonpsy deleted the dirichlet branch June 23, 2021 19:54
@zoq
Member

zoq commented Jun 23, 2021

Thanks 👍

3 participants