
Implement Cholesky for covariance handling #408

Merged

Conversation

@veni-vidi-vici-dormivi (Collaborator) commented Feb 28, 2024

Closes #402 and #168

@veni-vidi-vici-dormivi changed the title from "Test if drwaing realizations is close" to "Test if drawing realizations is close" on Feb 28, 2024

codecov bot commented Feb 28, 2024

Codecov Report

Attention: Patch coverage is 90.90909%, with 1 line in your changes missing coverage. Please review.

Project coverage is 88.01%. Comparing base (9b0b76b) to head (c5fb678).
Report is 17 commits behind head on main.

❗ The current head c5fb678 differs from the pull request's most recent head a486a8e. Consider uploading reports for commit a486a8e to get more accurate results.

Files | Patch % | Lines
mesmer/stats/_auto_regression.py | 90.90% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #408      +/-   ##
==========================================
+ Coverage   87.90%   88.01%   +0.10%     
==========================================
  Files          40       40              
  Lines        1745     1760      +15     
==========================================
+ Hits         1534     1549      +15     
  Misses        211      211              
Flag | Coverage Δ
unittests | 88.01% <90.90%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown.

@mathause mentioned this pull request on Feb 28, 2024
@mathause (Member)

Interesting 😉 Can you merge main (after #409)? If you add ${{ matrix.os }} to the name on L16, we'll directly see which OS fails.

@veni-vidi-vici-dormivi (Collaborator, Author)

I wanted to test with a new dataset but can't save a new one, see pydata/xarray#8743.

@veni-vidi-vici-dormivi (Collaborator, Author) commented Mar 1, 2024

Well, now I am really confused 😅
Edit: For context, the tests on the dataset I created on macOS failed on both Ubuntu and macOS but succeeded on Windows.

@veni-vidi-vici-dormivi (Collaborator, Author)

macOS-latest is macOS 12, and I created the dataset on macOS 14, so let's try that.

@veni-vidi-vici-dormivi (Collaborator, Author)

Well, would you look at that. The dataset from macOS 14 succeeds on macOS 14 and windows-latest.

@mathause (Member) commented Mar 4, 2024

Well, would you look at that. The dataset from macOS 14 succeeds on macOS 14 and windows-latest.

Interesting that macOS-latest does not seem to be the latest macOS 🤷‍♂️

@veni-vidi-vici-dormivi (Collaborator, Author)

Yes, but they are open about it:
From https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources:
Note: The -latest runner images are the latest stable images that GitHub provides, and might not be the most recent version of the operating system available from the operating system vendor.

@veni-vidi-vici-dormivi (Collaborator, Author)

It bugs me that I cannot reproduce the large difference between the datasets we get on Ubuntu and macOS. The test passes locally when I test the macOS dataset on atmos (openSUSE). GitHub Actions doesn't provide openSUSE, but one can set up self-hosted runners, see here. However, this is not recommended for public repositories for obvious reasons. Do we have an Ubuntu machine where we could test where things start to diverge?

Here are some observations I have made:

  • The test passes on Mac and atmos locally with data generated on Mac
  • Between atmos and Mac I find differences for all emulation steps:
    • maximum differences in the global variability emulations are on the order of 1e-16
    • maximum differences in the local trend response are on the order of 1e-15
    • maximum differences in the local variability are on the order of 1e-13
    • maximum differences in the resulting emulations are on the order of 1e-13
  • Looking at the difference between the inverted matrices on the two systems (using numpy.linalg.inv and scipy.linalg.inv), I find maximum differences of 1e-14. The maximum difference between the numpy and scipy methods is also 1e-14 (see the sketch below).
  • The params of all emulation steps are equal
  • The fact that we already see small differences for the global variability suggests that it is also something about the random number generator, not just the matrix inversion. Interestingly, however, the differences in the global variability (and subsequently the local trend emulations) only happen sometimes; the majority of the array is exactly equal. For the inverted matrices, the local variability emulations, and the result, the whole array is slightly off.

All of this did not get me much closer to understanding why the difference between the results on Ubuntu and Mac is up to 0.8.
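
For reference, the inverse comparison mentioned in the list above can be reproduced with a few lines like the following; the matrix here is a random positive-definite stand-in, not MESMER's actual localized covariance:

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)

# random symmetric positive-definite stand-in for the localized covariance
B = rng.normal(size=(50, 50))
localized_covariance = B @ B.T + 50 * np.eye(50)

inv_np = np.linalg.inv(localized_covariance)
inv_sp = scipy.linalg.inv(localized_covariance)

# maximum absolute difference between the two inversion routines
print(np.abs(inv_np - inv_sp).max())
```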

@mathause (Member) commented Mar 4, 2024

I tried some more stuff:

  • on cfc and on my laptop I get large differences
  • on argon I do not
  • on argon and cfc I use the same environment

I checked np.show_config(). What I saw is that argon has more SIMD extensions, which are maybe used for the matrix inversions. But this is a wild guess at the moment.

The additional SIMD extensions on argon:

  • AVX512F
  • AVX512CD
  • AVX512_SKX
  • AVX512_CLX
  • AVX512_CNL
  • AVX512_ICL
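
For reference, this is the kind of generic inspection that can be compared between machines (not code from the PR); recent NumPy versions also list the SIMD extensions that were detected:

```python
import platform

import numpy as np

print(platform.platform())  # OS and architecture of the machine
print(np.__version__)

# prints the BLAS/LAPACK backend and, on recent NumPy versions,
# the baseline and detected SIMD extensions
np.show_config()
```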





@veni-vidi-vici-dormivi (Collaborator, Author)

Okay but that's something. Then I'll try to pin it down with output from cfc. Thanks!

@mathause (Member) commented Mar 4, 2024

In #168 (comment) I found that small differences (1e-15) can lead to one large difference in the "inverted" matrices (0.1), which leads to differences in the samples.

Yes, using cfc can definitely get you faster feedback.
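
As a toy illustration of that sensitivity (a made-up, nearly singular 2x2 matrix, not the PR's data), a perturbation on the order of machine precision in an ill-conditioned covariance shifts its inverse by many orders of magnitude more:

```python
import numpy as np

eps = 5e-8  # makes the matrix nearly singular (condition number ≈ 2 / eps)
cov = np.array([[1.0, 1.0 - eps], [1.0 - eps, 1.0]])

perturbed = cov.copy()
perturbed[0, 1] += 1e-15  # perturbation on the order of machine precision

# the entries of the inverse are ~1e7, so the tiny perturbation is
# amplified to a difference on the order of 0.1
diff = np.linalg.inv(cov) - np.linalg.inv(perturbed)
print(np.abs(diff).max())
```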

@veni-vidi-vici-dormivi (Collaborator, Author)

Okay, saving the factor A that is used to draw from the multivariate normal makes the emulations pass on different OSes. However, this just shifts our problem to testing the calibration, where we save A. That will not be the same between different OSes, but one could argue this is less important to the user, since the calibration can always be exported.

A second issue here is that I was not actually able to reproduce the behavior of numpy.random.multivariate_normal, even though I think I copied the exact code they use, but maybe I am wrong?
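
For context, a hedged sketch of the general idea (an illustration, not the MESMER implementation): once a factor A with A @ A.T equal to the covariance is computed and saved at calibration time, drawing realizations only needs a matrix product, so the platform-dependent factorization is not redone at draw time:

```python
import numpy as np

rng = np.random.default_rng(42)

cov = np.array([[2.0, 0.3], [0.3, 1.0]])
mean = np.zeros(2)

# factor computed once (e.g. during calibration) and saved;
# the lower-triangular Cholesky factor is one possible choice of A
A = np.linalg.cholesky(cov)

# drawing then only involves standard normal draws and a matrix product
z = rng.standard_normal(size=(10_000, 2))
samples = mean + z @ A.T

print(np.cov(samples, rowvar=False))  # close to `cov`
```

For what it's worth, numpy.random.Generator.multivariate_normal factorizes the covariance with an SVD by default (method="svd", with "eigh" and "cholesky" as alternatives), so the exact draws depend on which factor is used.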

@veni-vidi-vici-dormivi (Collaborator, Author)

Uff, sorry, the commits are a mess here; "fix dimensions of innovations" actually also includes switching from the numpy multivariate normal to the scipy multivariate normal.

@veni-vidi-vici-dormivi (Collaborator, Author)

So this looks pretty good! I see two things to do now:

  1. test this approach with actual emulations and see if they are plausible
  2. also implement Cholesky in the localized covariance module, but maybe in another PR? But then we would want to check the emulations again...

@mathause (Member) left a comment


Very nice! Can you

  • rename the PR
  • open an issue to re-consider the try ... except np.linalg.LinAlgError approach? (potentially require method="eigh"; see the sketch below)

  1. test this approach with actual emulations and see if they are plausible
  2. also implement Cholesky in the localized covariance module, but maybe in another PR? But then we would want to check the emulations again...

Yes and yes. Maybe do 1. after 2.?
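
For illustration, a minimal sketch of the try ... except np.linalg.LinAlgError pattern mentioned above (an assumption about the idea, not the merged MESMER code): attempt a Cholesky factorization and fall back to a symmetric eigendecomposition when the matrix is not numerically positive definite:

```python
import numpy as np


def factorize_covariance(cov):
    """Return a factor A such that A @ A.T reproduces cov."""
    try:
        # fast path: requires cov to be (numerically) positive definite
        return np.linalg.cholesky(cov)
    except np.linalg.LinAlgError:
        # fallback: symmetric eigendecomposition, clipping small negative
        # eigenvalues that can arise from round-off
        eigvals, eigvecs = np.linalg.eigh(cov)
        eigvals = np.clip(eigvals, 0.0, None)
        return eigvecs * np.sqrt(eigvals)
```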

Review comments on:

  • .github/workflows/ci-workflow.yml (outdated, resolved)
  • mesmer/stats/_auto_regression.py (outdated, resolved)
  • tests/integration/test_draw_realisations_from_bundle.py (outdated, resolved)
  • mesmer/stats/_auto_regression.py (resolved)
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
@veni-vidi-vici-dormivi changed the title from "Test if drawing realizations is close" to "Stabilize drawing emulations" on Mar 8, 2024
@mathause (Member) left a comment


I'll assume the changelog entry is for the coming PR?

edit: sorry my bad, all good

@veni-vidi-vici-dormivi (Collaborator, Author)

Happy to announce that the emulations look good and are approximately 10 times faster (for my small profiling script)! 😊😊

@mathause (Member)

Hurray 😄 Did you test the training and the drawing?

@veni-vidi-vici-dormivi (Collaborator, Author)

Did you test the training and the drawing?

Yes, so calibrating + emulating together is 10 times faster. Drawing from the autocorrelated process is 10 times faster and finding the localized covariance is 22 times faster. find_localized_empirical_covariance takes the most time in both cases.

@mathause (Member)

awesome!

@veni-vidi-vici-dormivi changed the title from "Stabilize drawing emulations" to "Implement Cholesky for covariance handling" on Mar 12, 2024
@veni-vidi-vici-dormivi merged commit 79c8ee0 into MESMER-group:main on Mar 12, 2024
8 checks passed
@veni-vidi-vici-dormivi deleted the test_draw_close branch on March 12, 2024 at 14:55
Successfully merging this pull request may close these issues.

why are emulations not stable?
2 participants