Safe default for blocks_between_recompute at full precision #4109
Proposed changes
Change the default number of blocks between recomputations of the wavefunction (primarily the inverse Slater matrices) in full precision (non-mixed precision) builds from the current value of zero (never recompute) to 10. Also set the input parameter in a couple of tests for coverage.
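As a rough illustration of the semantics described above, the following Python sketch shows how a recompute period maps onto blocks. The function name `should_recompute` is hypothetical, and the choice to recompute at the end of every Nth block is an assumption; the actual QMCPACK driver may trigger at a different point.

```python
def should_recompute(block_index: int, blocks_between_recompute: int) -> bool:
    """Return True if the wavefunction should be rebuilt after this block.

    block_index is zero-based. A period of 0 means "never recompute",
    which is the old full precision default described in this PR.
    """
    if blocks_between_recompute < 0:
        raise ValueError("blocks_between_recompute must be >= 0")
    if blocks_between_recompute == 0:
        return False
    # Assumed convention: recompute at the end of every Nth block.
    return (block_index + 1) % blocks_between_recompute == 0


# Blocks after which a 30-block run would recompute with a period of 10.
recompute_blocks = [b for b in range(30) if should_recompute(b, 10)]
```

With a period of 10, a 30-block run recomputes three times; with a period of 0 it never does, which is the behavior this PR moves away from.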
QMCPACK should not have unsafe defaults, and this removes one. Closes #3056. Currently, if this input is not set, the results are guaranteed to be wrong after a sufficiently long run because numerical error accumulates. Unfortunately, while the error is clearly larger in larger systems, how it accumulates is not obvious, particularly for wavefunctions of different quality, where e.g. occasional near electron coalescence presumably risks worsening the error. Some of the reported problems with production runs could potentially be due to this. While there are other, more plausible causes, this is an easy fix for one possibility. (Thanks to @jtkrogel for discussion on this issue.)
The new value of 10 should be benign for performance in nearly all circumstances while being numerically safe.
Experts can set a larger value or zero if they wish.
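For readers unsure where the setting goes, here is a minimal sketch of overriding the default. It assumes the parameter is given in the usual `<parameter name="...">` form inside a `<qmc>` driver section; only the name `blocks_between_recompute` comes from this PR, while the other values and the helper `read_blocks_between_recompute` are hypothetical. The fragment is embedded in Python so it can be parsed and checked.

```python
import xml.etree.ElementTree as ET

# Hypothetical driver section. Only the blocks_between_recompute parameter
# name is taken from this PR; the remaining values are illustrative.
QMC_SECTION = """
<qmc method="vmc" move="pbyp">
  <parameter name="blocks">100</parameter>
  <parameter name="steps">10</parameter>
  <parameter name="blocks_between_recompute">50</parameter>
</qmc>
"""


def read_blocks_between_recompute(xml_text: str, default: int = 10) -> int:
    """Return the recompute period from a driver section.

    Falls back to `default` when the parameter is absent. The value 10 is the
    new full precision default proposed here; 0 would mean "never".
    """
    root = ET.fromstring(xml_text.strip())
    for param in root.iter("parameter"):
        if param.get("name") == "blocks_between_recompute":
            return int(param.text.strip())
    return default


explicit = read_blocks_between_recompute(QMC_SECTION)
implicit = read_blocks_between_recompute('<qmc method="vmc" move="pbyp"/>')
```

Here `explicit` picks up the larger expert-chosen value, while `implicit` falls back to the proposed default because the parameter is not set.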
A future improvement could involve investigating the error, checking the accumulated error in at least the inverses, or changing the default depending on electron count and steps per block. Ideally the code would check numerical consistency and restart a section or block if e.g. the local energy was too poor. The same analysis could be applied to mixed precision builds as well.
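One way such a check on the inverses might look is sketched below in plain Python. This is not QMCPACK code: it assumes single-row replacement of a Slater-like matrix with a Sherman-Morrison update of the stored inverse, and it measures drift as the largest entry of |A·A⁻¹ − I|. All function names and the tolerance are hypothetical.

```python
import random


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def update_inverse_row(a_inv, k, new_row):
    """Sherman-Morrison update of a_inv in place after row k of A becomes new_row.

    Returns the determinant ratio det(A_new) / det(A_old).
    """
    n = len(a_inv)
    w = [sum(new_row[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    ratio = w[k]
    for i in range(n):
        c = a_inv[i][k] / ratio
        for j in range(n):
            if j != k:
                a_inv[i][j] -= w[j] * c
        a_inv[i][k] = c
    return ratio


def residual(a, a_inv):
    """Largest absolute entry of A * A_inv - I."""
    n = len(a)
    return max(abs(sum(a[i][m] * a_inv[m][j] for m in range(n)) - (i == j))
               for i in range(n) for j in range(n))


def recompute_if_drifted(a, a_inv, tol=1e-10):
    """Return (inverse, recomputed); rebuild from scratch if the residual exceeds tol."""
    if residual(a, a_inv) > tol:
        return invert(a), True
    return a_inv, False


random.seed(0)
n = 6


def random_row(k):
    # A boosted diagonal keeps the toy matrix well conditioned.
    row = [random.uniform(-1.0, 1.0) for _ in range(n)]
    row[k] += 2 * n
    return row


a = [random_row(i) for i in range(n)]
a_inv = invert(a)
for step in range(2000):  # many single-row "moves" with incremental updates
    k = step % n
    a[k] = random_row(k)
    update_inverse_row(a_inv, k, a[k])
drift = residual(a, a_inv)
```

This toy matrix is deliberately well conditioned, so `drift` stays small; the point is only the shape of the check, which could feed a decision to recompute or to restart a block.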
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
gcc 12 mpi, nightly sulfur config.
Checklist