Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe default for blocks_between_recompute at full precision #4109

Merged
merged 3 commits into from
Jul 12, 2022

Conversation

prckent
Copy link
Contributor

@prckent prckent commented Jul 12, 2022

Proposed changes

Change the default number of blocks between computing the wavefunction (primarily the inverse Slater matrices) for non-mixed precision builds to 10 from the current value of zero (never). Set the input parameter in a couple of tests for coverage.

QMCPACK should not have unsafe defaults, and this removes one. Closes #3056 . Currently if this input is not set, the results are guaranteed to be wrong after a sufficiently long run due to accumulation of numerical error. Unfortunately while the error clearly is larger in larger systems, how it accumulates is not obvious, particularly for different quality wavefunctions where e.g. occasional near electron coalescence presumably risks worsening the error. Potentially some of the reported problems with production runs could be due to this. While there are other more plausible reasons, this is an easy fix to one possibility. (Thanks to @jtkrogel for discussion on this issue.)

The new value of ten should be benign in terms of performance in nearly all circumstances while being numerically safe.
Experts can set a larger value or zero if they wish.

A future improvement could involve investigation of the error, checking the accumulated error in at least the inverses, or changing the default depending on electron count and steps per block. Ideally the code would check numerical consistency and restart a section or block if e.g. the local energy was too poor. The same analysis could be applied to mixed precision builds as well.

What type(s) of changes does this code introduce?

  • Bugfix
  • New feature
  • Testing changes (e.g. new unit/integration/performance tests)
  • Documentation changes

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

gcc 12 mpi, nightly sulfur config.

Checklist

  • Yes. This PR is up to date with current the current state of 'develop'
  • Yes. Code added or changed in the PR has been clang-formatted
  • Yes. This PR adds tests to cover any new code, or to catch a bug that is being fixed
  • Yes. Documentation has been added (if appropriate)

from scratch: =1 by default when using mixed precision. =0 (no
recompute) by default when not using mixed precision. Recomputing
introduces a performance penalty dependent on system size.
- ``blocks_between_recompute`` Recompute the accuracy critical determinant part of the wavefunction from scratch: =1 by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple times of the same documentation. I think you only updated one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It will be a good day when we put the batched docs first and then again when the old docs can be removed.

Copy link
Contributor

@ye-luo ye-luo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ye-luo
Copy link
Contributor

ye-luo commented Jul 12, 2022

Test this please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change blocks_between_recompute default?
2 participants