Fix spline block offset caused by #4439 #4443

Merged
merged 1 commit into from
Feb 7, 2023


ye-luo
Contributor

@ye-luo ye-luo commented Feb 7, 2023

Proposed changes

Affects runs with more than 512 splines. Caught by the NiO a64 test in the complex build, which has 384 complex orbitals, i.e. 768 splines.

https://cdash.qmcpack.org/CDash/testDetails.php?test=25095055&build=396449
The error message reports an invalid memory access, but the root cause was bad orbital values that caused NaN in the electron positions.
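To illustrate how a missing block offset can stay hidden until the spline count exceeds one block, here is a minimal sketch. It is not the QMCPACK code: `copy_in_blocks`, `make_indexed_values`, and the `apply_offset` switch are hypothetical names, and the block size of 512 is only inferred from the ">512 splines" observation above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copy `src` into a result vector block by block.
// When `apply_offset` is false the read index ignores the block start
// (the hypothetical bug): every block re-reads the first block's data.
std::vector<double> copy_in_blocks(const std::vector<double>& src, std::size_t block_size, bool apply_offset)
{
  assert(block_size > 0);
  std::vector<double> out(src.size());
  for (std::size_t first = 0; first < src.size(); first += block_size)
  {
    // The last block may be shorter than block_size.
    const std::size_t count = std::min(block_size, src.size() - first);
    for (std::size_t j = 0; j < count; ++j)
    {
      const std::size_t read_index = apply_offset ? first + j : j;
      out[first + j] = src[read_index];
    }
  }
  return out;
}

// Fake per-spline values: entry i is simply i, so misplaced data is easy to spot.
std::vector<double> make_indexed_values(std::size_t n)
{
  std::vector<double> v(n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = static_cast<double>(i);
  return v;
}
```

With 768 values and a block size of 512, the variant without the offset returns the first 256 values again in positions 512 to 767. Any input of 512 values or fewer comes out correct, which is why small test systems would not expose this kind of error.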

What type(s) of changes does this code introduce?

  • Bugfix

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

epyc-server

Checklist

  • Yes. This PR is up to date with the current state of 'develop'

@ye-luo
Contributor Author

ye-luo commented Feb 7, 2023

Test this please

@prckent
Contributor

prckent commented Feb 7, 2023

The lack of unit tests for large electron counts/basis sizes is something that I have been concerned about. Can you expand the unit tests in test_MO.cpp or similar to include one with a number of splines larger than all of the chunk sizes/teaming sizes? Say 997 splines, or some other slightly awkward prime? Such a test would likely have caught this issue.
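As a rough sketch of the size criterion suggested here, the helper below checks that a candidate spline count is larger than every chunk size and is not an exact multiple of any of them, so the final chunk is always partial. The name `exercises_all_chunks` and the chunk sizes used with it are assumptions for illustration, not values taken from QMCPACK.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when `num_splines` is larger than every chunk size and is not
// an exact multiple of any of them, so the last chunk is always partial.
bool exercises_all_chunks(std::size_t num_splines, const std::vector<std::size_t>& chunk_sizes)
{
  for (const std::size_t chunk : chunk_sizes)
  {
    assert(chunk > 0);
    if (num_splines <= chunk || num_splines % chunk == 0)
      return false;
  }
  return true;
}
```

For hypothetical chunk sizes of 128, 256, and 512, a count of 997 passes this check, while 384 and 1024 do not.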

@ye-luo
Contributor Author

ye-luo commented Feb 7, 2023

test_MO.cpp covers LCAO. The lack of a way to fake spline orbitals is an issue.

@prckent
Contributor

prckent commented Feb 7, 2023

I didn't find a test_spline equivalent to test_MO, although there are spline tests elsewhere. We could use fake spline data or even real data for a small system that is replicated so that there are sufficient orbitals. Both would have caught this problem since the expected orbital values would not have been obtained.

@prckent
Contributor

prckent commented Feb 7, 2023

The LCAO code could have the same issues, either already or in the future.

@ye-luo
Contributor Author

ye-luo commented Feb 7, 2023

> I didn't find a test_spline equivalent to test_MO, although there are spline tests elsewhere. We could use fake spline data or even real data for a small system that is replicated so that there are sufficient orbitals. Both would have caught this problem since the expected orbital values would not have been obtained.

I don't have a way in mind to replicate small system orbitals to a large system. The current bug can be viewed as repeating orbitals beyond 512.

Fake data is more workable and it is also useful to create benchmarks rather than relying on large h5 files.
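One possible shape for such fake data, sketched under assumptions: a deterministic table of coefficients generated in memory, with the spline index encoded in each value so that a read from the wrong spline is detectable. `make_fake_coefs`, `spline_index_of`, and the row-major layout are hypothetical and do not reflect how QMCPACK stores spline coefficients.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a row-major table of fake coefficients: one row per spline,
// `points_per_spline` entries per row. The integer part of every entry is the
// 1-based spline index; the fractional part (at most 0.5) varies with the point.
std::vector<double> make_fake_coefs(std::size_t num_splines, std::size_t points_per_spline)
{
  assert(points_per_spline > 0);
  std::vector<double> coefs(num_splines * points_per_spline);
  for (std::size_t s = 0; s < num_splines; ++s)
    for (std::size_t p = 0; p < points_per_spline; ++p)
      coefs[s * points_per_spline + p] = static_cast<double>(s + 1) + 1.0 / static_cast<double>(p + 2);
  return coefs;
}

// Recover the 0-based spline index encoded in a fake coefficient.
std::size_t spline_index_of(double coef) { return static_cast<std::size_t>(coef) - 1; }
```

Because the table is computed from the indices alone, its size can be chosen freely (for example 997 splines) without shipping an h5 file, and the same generator could feed a benchmark.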

@prckent
Contributor

prckent commented Feb 7, 2023

> Fake data is more workable and it is also useful to create benchmarks rather than relying on large h5 files.

Agreed

@prckent prckent merged commit f437528 into QMCPACK:develop Feb 7, 2023
@ye-luo ye-luo deleted the fix-spline-blocking-offset branch February 13, 2023 23:53
@prckent prckent mentioned this pull request Aug 18, 2023