Add Dirichlet Weight Initialization #296
tests are passing
/**
 * The Dirichlet method for initializing weights. Sampling a dirichlet distribution
 * with parameters set to one returns points lying on unit simplex with uniform
 * distribution.
 *
 */
@zoq Not sure what the source should be?
I don't think there is one we should reference here.
Can you add a test for the policy as well?
Do you have any specifics in mind?
We can just use the existing one but change the policy.
I could test it against a difficult ZDT problem and see how it fares. The paper tested against some ZDT problems, so we can see how closely the results match the theory.
Right, let's test it on, e.g., one ZDT problem.
const size_t numPoints,
const double epsilon)
{
  MatType weights = arma::randg<MatType>(numObjectives, numPoints,
Generation functions usually allow for the underlying distribution shape to change, e.g.:
random.my_dist(n, shape1, shape2)
So, this is going to generate a Dirichlet under alpha = 1 / shape = 1. Or:
random.my_dist(n, alpha = 1)
Is that the desired outcome?
alpha = 1. See this.
Okay, so this weight policy is being designed with a specific application in mind, so alpha doesn't need to be flexible.
Yes, since we want a uniform distribution, the alpha vector is set to a vector of ones.
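As a side note on why alpha = 1 yields the uniform case: the standard Dirichlet density is written out below, with n standing for the number of objectives (an assumed symbol, chosen to match the notation later in this thread). With every alpha_i equal to one, the product term vanishes and the density is constant on the simplex.

```latex
f(x_1, \dots, x_n; \alpha_1, \dots, \alpha_n)
  = \frac{\Gamma\!\left(\sum_{i=1}^{n} \alpha_i\right)}{\prod_{i=1}^{n} \Gamma(\alpha_i)}
    \prod_{i=1}^{n} x_i^{\alpha_i - 1},
\qquad x_i \ge 0, \quad \sum_{i=1}^{n} x_i = 1

\alpha_1 = \dots = \alpha_n = 1
  \;\Longrightarrow\;
  f(x_1, \dots, x_n) = \Gamma(n) = (n - 1)!
```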
MatType weights = arma::randg<MatType>(numObjectives, numPoints,
    arma::distr_param(1.0, 1.0)) + epsilon;
// Normalize each column.
return arma::normalise(weights, 1, 0);
So, thinking about this... We have a gamma distribution given as:
Y_i ~ gamma(alpha_i, 1)
From there, we're creating the normalization of:
X = sum_{i = 1}^{n} Y_i
V = (Y_1/X, Y_2/X, ... Y_n/X) ~ Dirichlet(alpha_1, ... alpha_n)
https://en.wikipedia.org/wiki/Dirichlet_distribution#Gamma_distribution
Looking at:
http://arma.sourceforge.net/docs.html#normalise
normalise is giving us a unit p-norm. With p = 1, we're getting:
X = sum abs(Y_i)
The abs() goes against the summation component, but gamma RVs will always be > 0.
So, long story short... We're doing one extra operation, but the end outcome will be the same.
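The gamma-normalization argument above can be sketched without Armadillo. This is only an illustration under stated assumptions: `SampleUniformSimplex` is a hypothetical helper, not part of ensmallen, it uses `std::gamma_distribution` from the standard library in place of `arma::randg`, and it omits the PR's `epsilon` offset.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of ensmallen): draws one weight vector from
// Dirichlet(1, ..., 1) by normalizing independent Gamma(1, 1) samples.
std::vector<double> SampleUniformSimplex(const std::size_t numObjectives,
                                         std::mt19937& rng)
{
  assert(numObjectives > 0);
  std::gamma_distribution<double> gamma(1.0, 1.0);  // shape = 1, scale = 1

  std::vector<double> weights(numObjectives);
  double sum = 0.0;
  for (double& w : weights)
  {
    w = gamma(rng);
    sum += w;  // No abs() here: gamma samples are non-negative.
  }

  // For non-negative entries, dividing by the plain sum is the same as
  // dividing by the L1 norm.
  for (double& w : weights)
    w /= sum;

  return weights;
}
```

Each returned vector corresponds to one column of the numObjectives x numPoints weight matrix discussed here, so its entries should sum to one up to floating-point rounding.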
I think p=1 gives something like
std::sqrt( elem0^2 + elem1^2 + elem2^2 ..... elemn^2)
and not the sum, if I've understood this correctly. I suppose you're saying that, assuming X = sum abs(Y_i), the abs part is unnecessary?
Nope, that would be p=2 / l2-norm. This is using an l1-norm by column.
p=1 gives:
std::pow( abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 ..... abs(elemn)^1, 1.0);
or:
(abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 + ..... + abs(elemn)^1)^1
= abs(elem0)^1 + abs(elem1)^1 + abs(elem2)^1 + ..... + abs(elemn)^1
= abs(elem0) + abs(elem1) + abs(elem2) + ..... + abs(elemn)
I think by-column is correct instead of by-row due to the weights being structured numObjectives x numPoints (norm across objectives under a single point).
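To make the p = 1 versus p = 2 distinction concrete, here is a small standard-library sketch of a column p-norm and the matching normalization. `PNorm` and `NormaliseColumn` are hypothetical names used only for illustration; they mirror the formula above and are not Armadillo's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: p-norm of one column, (sum_i |x_i|^p)^(1/p).
double PNorm(const std::vector<double>& column, const double p)
{
  assert(p >= 1.0);
  double sum = 0.0;
  for (const double x : column)
    sum += std::pow(std::abs(x), p);
  return std::pow(sum, 1.0 / p);
}

// Hypothetical helper: scales a column so that its p-norm equals one.
std::vector<double> NormaliseColumn(std::vector<double> column, const double p)
{
  const double norm = PNorm(column, p);
  assert(norm > 0.0);
  for (double& x : column)
    x /= norm;
  return column;
}
```

For the column (3, 4), p = 1 gives a norm of 7 (the plain sum of absolute values) and p = 2 gives 5 (the Euclidean length). Only the p = 1 normalization makes the entries sum to one, which is what the weight vectors need.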
Yes, I've done it on a column basis so the normalization of each point is ensured. I just revised L-norms; I was mistaken about the L1 norm formula, thanks for pointing it out.
I think the "abs" part shouldn't really be much of a concern; either way, we know each element is > 0. We have one line of code that does the job for us, and I really don't think the additional "abs" would severely hinder performance.
Planning to add
Add DirichletMOEAD
I tested it against the ZDT3 problem. I ran it over 1000 times with a random seed using this bash command. It passed all the tests.
#!/bin/bash
for (( i = 0; i < 1000; i++))
do
  ./ensmallen_tests MOEADDIRICHLETZDT3Test --rng-seed time
done
I suppose it's fair to say this PR is GTM?
Looks great to me.
Second approval provided automatically after 24 hours. 👍
Let's see if the build comes back green and merge it afterwards.
Thanks 👍
Continuing #293
using DirichletMOEAD = MOEAD<Dirichlet, Tchebycheff>