Maximum Likelihood #69
Conversation
rct_optimizations/include/rct_optimizations/maximum_likelihood.h
Branch updated: 2c2562a to b0c99b1
```cpp
Eigen::Map<const ArrayXX> param_array(parameter[0], mean.rows(), mean.cols());
Eigen::Map<ArrayXX> residual_array(residual, mean.rows(), mean.cols());
residual_array = (param_array - mean.cast<T>()) / stdev.cast<T>();
```
@marip8 I believe this is missing a sqrt(2) in the denominator. Also, from a nomenclature standpoint, I don't think this is strictly a maximum likelihood estimate. It certainly adds a cost for straying from the prior, and it is related to the log of the likelihood. It would have made more sense here to use: 1/(sigma*sqrt(2*pi)) * e^(-(param[i] - mean[i])^2 / (2*sigma^2)).
By taking the log, one simplifies the math. The minimizer of the negative log-likelihood and the maximizer of the likelihood occur at the same point. However, the cost value is scaled. If you care.
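For reference, the negative log-likelihood this comment alludes to can be written out as follows (a sketch for context, not code from the PR):

```latex
% Negative log-likelihood of x under a normal prior N(mu, sigma^2):
\[
  -\ln p(x) = \ln\!\bigl(\sigma\sqrt{2\pi}\bigr) + \frac{(x-\mu)^2}{2\sigma^2}
\]
% The first term is constant in x, so dropping it does not move the
% minimizer. The sqrt(2) mentioned above only rescales the quadratic term:
% a residual of (x - mu)/(sigma*sqrt(2)) reproduces the 1/(2*sigma^2)
% factor exactly when the squared residual is used directly as the cost.
```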
This probably deserves a little more attention in the future regarding the naming, because it is not maximum likelihood estimation, but it certainly attempts to maximize the likelihood of the parameter w.r.t. the input mean and standard deviation.
I got the math for this from the linked example in the Ceres repository. I was surprised that it was done this way, but it seems to work better than using the Gaussian distribution and log-likelihood equations. When I used those equations in the cost function (modified slightly, since Ceres does residual minimization rather than maximization), the optimization in the unit test still converged, but it wouldn't drive the variables to the mean very well.
@marip8 You can't use the likelihood itself as a cost, because you want to maximize likelihood, not minimize a cost. You could create a cost that is the difference between the likelihood at the prior and the likelihood at the current estimate. The only thing questionable about what's here is that its units are not likelihood, but it's named maximum likelihood. Still, if this is the only cost, then it results in a maximum likelihood estimate given the prior, meaning it returns the prior (parameter.x = x_bar), the most likely value given your prior knowledge about the variable.
> You can't use the likelihood itself as a cost because you want to maximize likelihood, not minimize a cost. You could create a cost that is the difference in likelihood at the prior, and the current estimate's likelihood.

Sure, that's what I meant by "modified slightly since Ceres does residual minimization rather than maximization". I implemented this difference strategy with both likelihood and log-likelihood, and Ceres was not able to drive the values sufficiently close to the mean.

> The only thing questionable about what's here is that its units are not likelihood, but it's named maximum likelihood.

I'm open to naming suggestions, but I haven't come up with anything that really conveys the purpose of this cost. Maybe `LikelihoodCost`?
This PR adds a templated cost function for computing the likelihood of a parameter given its expected mean and standard deviation. The cost function is based on this example from Ceres.