Use Cholesky decomposition in correlated_values and correlated_values_norm #103
Conversation
I want to make this change, but this PR is old enough that merging it in would be challenging. I'll explore a new PR that lifts the main changes from this PR. @ces42, if you're still interested in this, you should feel free to open a new PR as well.
@jagerber48 @ces42 I'm +1 on trying to resurrect this.
I could try to make an analogous PR again. The main issue is still that numpy's Cholesky decomposition does not work in the semi-definite case, so some kind of fallback is necessary (and creating the fallback will probably be more work than the "general case"). I can see 3 options for the fallback:

1. Use scipy's LDL decomposition (`scipy.linalg.ldl`), at the cost of a scipy dependency.
2. Fall back to the existing eigendecomposition-based code when the matrix is only semi-definite.
3. Implement an LDL decomposition ourselves in numpy.

It would be possible to detect whether scipy has been imported and only do 1. in that case. This might be considered a bad practice though...
@newville As mentioned above, I'm planning a new PR that lifts the main changes from this one, so I'm exploring knocking the Cholesky decomposition out first.
I like this option a lot. The semi-definite case would be something like: you have a covariance matrix of random variables, but there is some basis in which one or more of the random variables has zero variance? It's a literal edge/degenerate case that hadn't occurred to me, but I'm glad you pointed it out. Since it is an edge case, I think it's no problem if users take a performance hit when they hit it.
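A toy example of such a degenerate case: two perfectly correlated variables give a rank-1, positive semi-definite covariance matrix that numpy's Cholesky rejects.

```python
import numpy as np

# Perfect correlation: the covariance matrix has rank 1, i.e. one
# zero eigenvalue, so it is semi-definite but not positive definite.
cov = np.array([[1.0, 2.0],
                [2.0, 4.0]])

print(np.linalg.eigvalsh(cov))  # [0. 5.]
np.linalg.cholesky(cov)         # raises np.linalg.LinAlgError
```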
We're wringing our hands in some other threads about having scipy as an (optional) dependency.
If it is slow, what is the advantage of this approach over 2.? Maybe it's faster than 2. but slower than 1.? It seems like we could start with approach 2. and then evolve to 3. if there's demand for more performant decompositions in these semi-definite cases.
I don't think this is a bad practice. It would be a little bit clunky, but the logic for semi-definite matrices could be something like the sketch below:
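(A sketch only; `semi_definite_sqrt` is a placeholder name, and the scipy check uses `sys.modules` as suggested above.)

```python
import sys

import numpy as np


def semi_definite_sqrt(cov):
    """Return T such that T @ T.T == cov for positive semi-definite cov."""
    if "scipy" in sys.modules:
        # scipy has already been imported by the user: use its LDL
        # decomposition, which also handles semi-definite matrices.
        import scipy.linalg

        lu, d, _perm = scipy.linalg.ldl(cov)
        return lu @ np.sqrt(np.clip(d, 0.0, None))
    # Otherwise fall back to the slower eigendecomposition approach.
    evals, evecs = np.linalg.eigh(cov)
    return evecs * np.sqrt(np.clip(evals, 0.0, None))
```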
For positive-definite matrices it would just be:
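```python
transform = np.linalg.cholesky(cov)  # raises np.linalg.LinAlgError if cov is not positive definite
```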
While the above is kind of appealing, I'm personally tempted to say just leave […]

Be aware of this PR that is targeting these same functions and shows some of the discussion around optional dependencies: #268
This uses a Cholesky decomposition instead of diagonalization in `correlated_values` and `correlated_values_norm`. The idea is the same as presented in #99. However, dealing with degenerate covariance matrices proved to be a bit tricky, since numpy refuses to do a Cholesky decomposition of a matrix that is only positive semi-definite. Therefore I added a function that performs an LDL decomposition in those cases (this would also be available in scipy).
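For illustration, a minimal pure-numpy sketch of such an LDL decomposition for positive semi-definite matrices (`ldl_psd` is a placeholder name, not necessarily the function added in the PR):

```python
import numpy as np


def ldl_psd(A):
    # Sketch: LDL^T decomposition of a symmetric positive semi-definite A,
    # returning unit lower-triangular L and diagonal entries d such that
    # A == L @ np.diag(d) @ L.T.
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - L[j, :j] ** 2 @ d[:j]
        if d[j] > 0.0:
            L[j + 1:, j] = (A[j + 1:, j] - L[j + 1:, :j] @ (d[:j] * L[j, :j])) / d[j]
        else:
            # Zero pivot: for a semi-definite matrix the rest of this column
            # must be (numerically) zero, so leaving L[j+1:, j] = 0 is valid.
            d[j] = 0.0
    return L, d
```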
Local tests (and theory) suggest that this should be more precise and faster than the old implementation.
Both `correlated_values` and `correlated_values_norm` now raise an error if the covariance/correlation matrix fails to be (nearly) positive semi-definite.
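For context, a small usage example with a rank-deficient covariance matrix, which exercises the semi-definite path described above (`correlated_values` and `covariance_matrix` are the existing `uncertainties` functions):

```python
import numpy as np
from uncertainties import correlated_values, covariance_matrix

# Rank-deficient (positive semi-definite) covariance matrix:
# x and y are perfectly correlated, so numpy's Cholesky alone would fail.
cov = np.array([[1.0, 1.0],
                [1.0, 1.0]])
x, y = correlated_values([10.0, 20.0], cov)

print(x.std_dev, y.std_dev)       # 1.0 1.0
print(covariance_matrix([x, y]))  # recovers cov (up to floating-point noise)
```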