
Use cholesky decomposition in correlated_values and correlated_values_norm #103

Closed
wants to merge 9 commits

Conversation

@ces42

ces42 commented Aug 1, 2019

This uses a Cholesky decomposition instead of diagonalization in correlated_values and correlated_values_norm. The idea is the same as presented in #99.

However, dealing with degenerate covariance matrices proved to be a bit tricky, since numpy refuses to compute the Cholesky decomposition of a matrix that is only positive semi-definite. Therefore I added a function that performs an LDL decomposition in those cases (an LDL decomposition is also available in scipy).

Local tests (and theory) suggest that this should be more precise and faster than the old implementation.

Both correlated_values and correlated_values_norm now raise an error if the covariance/correlation matrix is not (at least approximately) positive semi-definite.
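
For reference, a minimal sketch of the idea (the matrix entries and nominal values below are made up for illustration; this is not the exact code from the PR):

```python
import numpy as np
from uncertainties import correlated_values

# Illustrative covariance matrix (values are made up).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Core idea of this PR: a Cholesky factor L with cov == L @ L.T maps
# independent unit-variance variables onto the correlated ones,
# replacing the diagonalization used previously.
L = np.linalg.cholesky(cov)
assert np.allclose(L @ L.T, cov)

# The public API is unchanged:
x, y = correlated_values([1.0, 2.0], cov)
```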

@jagerber48
Contributor

I want to make this change, but I think this PR is too old to merge as-is; bringing it up to date would be challenging. I'll explore a new PR that lifts the main changes from this one. @ces42, if you're still interested in this, feel free to open a new PR as well.

@newville
Member

newville commented Nov 4, 2024

@jagerber48 @ces42 I'm +1 on trying to resurrect this.
I might say both that a) it is not the highest priority, and b) it should not have to wait another 5 years ;).

@ces42
Author

ces42 commented Nov 4, 2024

I could try to make an analogous PR again. The main issue is still that numpy's Cholesky decomposition does not work in the semi-definite case, so some kind of fallback is necessary (and writing the fallback will probably be more work than the "general case"). I can see three options for the fallback:

  1. Use the LDL decomposition from scipy (sketched below). This would add scipy as a dependency.
  2. Use diagonalization as in the current version of uncertainties. This is slow and numerically imprecise.
  3. Hand-code an LDL (or similar) decomposition, as I ended up doing in this PR. This is slow and adds a lot of code.
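
For illustration, a sketch of option 1 on a toy rank-deficient matrix that numpy's Cholesky would reject (the matrix is made up; this is not the exact code from the PR):

```python
import numpy as np
from scipy.linalg import ldl

# Toy rank-1 covariance matrix: the second variable is an exact copy
# of the first, so the matrix is positive semi-definite but singular.
cov = np.array([[1.0, 1.0],
                [1.0, 1.0]])

lu, d, perm = ldl(cov)  # cov == lu @ d @ lu.T
# Build a square-root factor from the LDL factors; the clip guards
# against tiny negative entries in d caused by rounding.
L = lu @ np.sqrt(np.clip(d, 0.0, None))
assert np.allclose(L @ L.T, cov)
```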

It would be possible to detect whether scipy has been imported and only use option 1 in that case. This might be considered bad practice, though...
Any thoughts on what approach might be best?

@jagerber48
Contributor

> I might say both that a) it is not the highest priority, and b) it should not have to wait another 5 years ;).

@newville
Yes, the reason these blocks of code are popping up is that I think they're going to need to be reworked for the large AffineScalarFunc refactor, but I don't want to either

  • rewrite them using their same logic, or
  • also fold the transition to Cholesky decomposition into that PR,

so I'm exploring knocking the Cholesky decomposition out first.


@ces42

> Any thoughts on what approach might be best?

> 2. Use diagonalization as in the current version of uncertainties. This is slow and numerically imprecise.

I like this option a lot. The semi-definite case would be something like: you have a covariance matrix of random variables, but in some basis one or more of the variables has zero variance? It's a literal edge/degenerate case that hadn't occurred to me, but I'm glad you pointed it out. Since it is an edge case, I think it's no problem if users take a performance hit when they hit it.
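
(A toy illustration with made-up numbers: if y is defined as exactly 2*x, the covariance matrix is singular, numpy's Cholesky fails, and the eigendecomposition still goes through.)

```python
import numpy as np

# y = 2*x exactly, so cov has rank 1: semi-definite but not definite.
cov = np.array([[1.0, 2.0],
                [2.0, 4.0]])

np.linalg.eigvalsh(cov)  # array([0., 5.]) -- one zero-variance direction
np.linalg.cholesky(cov)  # raises numpy.linalg.LinAlgError
```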

> 1. Use the LDL decomposition from scipy. This would add scipy as a dependency.

We're wringing our hands in some other threads about having numpy as an optional or required dependency. I think appetite would be pretty low for adding scipy.

> 3. Hand-code an LDL (or similar) decomposition, as I ended up doing in this PR. This is slow and adds a lot of code.

If it is slow, what is the advantage of this approach over 2.? Maybe it's faster than 2. but slower than 1.? Seems like we could start with approach 2. and then evolve to 3. if there's demand for more performant decompositions in these semi-definite cases.

> It would be possible to detect whether scipy has been imported and only use option 1 in that case. This might be considered bad practice, though...

I don't think this is bad practice. It would be a little clunky, but the logic for semi-definite matrices could be something like:

  • exception if numpy is not importable
  • eigenvalue decomposition if numpy is importable but not scipy
  • scipy LDL decomposition if scipy is importable

For positive-definite matrices it would be:

  • exception if numpy is not importable
  • numpy Cholesky decomposition if numpy is importable

While the above is kind of appealing, I'm personally tempted to say we just leave scipy out of it as a first pass. For years the code has been using the slow numpy eigendecomposition and that has seemingly been OK for users. Users can always roll their own function if they have a more performant way to do the decomposition and it matters to them. We can always add the greedy (i.e., taking advantage of scipy if it's available) performance enhancement down the road.
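
A minimal sketch of that scipy-free first pass (the helper name and tolerance are made up for illustration):

```python
import numpy as np

def _sqrt_factor(cov):
    """Hypothetical helper: return L such that L @ L.T ~= cov."""
    try:
        # Fast, accurate path for positive-definite matrices.
        return np.linalg.cholesky(cov)
    except np.linalg.LinAlgError:
        # Semi-definite (or numerically borderline) fallback: the
        # slower eigendecomposition, i.e. approach 2 above.
        eigvals, eigvecs = np.linalg.eigh(cov)
        if eigvals.min() < -1e-8 * max(eigvals.max(), 1.0):
            raise ValueError("matrix is not (nearly) positive semi-definite")
        # Scale eigenvector columns by sqrt of (clipped) eigenvalues.
        return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
```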


Be aware of #268, a PR that targets these same functions and shows some of the discussion around optional dependencies.

@jagerber48
Contributor

jagerber48 commented Nov 16, 2024

Moving forward on this at #271. @ces42 I'm curious if you have any comments on that PR.

jagerber48 closed this Nov 16, 2024