Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sofk estimator test and manual entry about how estimator works #764

Merged
merged 8 commits into from
Dec 18, 2018

Conversation

Paul-St-Young
Copy link
Contributor

@Paul-St-Young Paul-St-Young commented Apr 12, 2018

This PR follows up on #756 but only introduce a test and manual updates

Documenting estimator manager has turned out to be a lot more involved than I expected.
I decided to upload the test examples first to get the discussion started.

The new tests can be ran using ctest -R estimator-sofk. I added two SkEstimator estimators per run. One outputs to scalar.dat while the other outputs to stat.h5. DMC is done using a very large time step and 1 step per block. This is a severe test of the ensemble average procedure because both walker weight and walker number undergo large fluctuation.

In the Python script check_collectables_h5dat, I get the S(k) value at the largest k vector and compare between scalar.dat and stat.h5. S(k->large) should fluctuation around 1.0 in both cases. Currently, scalar.dat fluctuates around 1.0 but stat.h5 does not.

sofk_dat-h5

add tests to check consistency between scalar.dat and stat.h5 output
paths
@ghost ghost assigned Paul-St-Young Apr 12, 2018
@ghost ghost added the in progress label Apr 12, 2018
@qmc-robot
Copy link

Can one of the maintainers verify this patch?

1 similar comment
@qmc-robot
Copy link

Can one of the maintainers verify this patch?

@Paul-St-Young
Copy link
Contributor Author

@markdewing is there an easy way to pass extra arguments to my Python script using the CMake macro SIMPLE_RUN_AND_CHECK?

I currently copied and pasted the same script for pbyp and allp, changing the "prefix" variable only.

@Paul-St-Young
Copy link
Contributor Author

Regarding the actual bug. I think Collectables is zeroed out at every step, but it may not be populated if all moves are rejected in advanceWalkers.

@markdewing
Copy link
Contributor

okay to test

@markdewing
Copy link
Contributor

@Paul-St-Young It should be straightforward to add arguments with some changes to the SIMPLE_RUN_AND_CHECK function. Use the ${ARGN} variable to get extra arguments to that function, and pass it the add_test function after ${check_cmd}

It think it should be enough to change the add_test call to

add_test(NAME "${test_name}"
    COMMAND "${check_cmd}" ${ARGN}
    WORKING_DIRECTORY "${work_dir}")

WIth this, any extra arguments given to SIMPLE_RUN_AND_CHECK will be passed as extra arguments to the check script

@Paul-St-Young
Copy link
Contributor Author

Thanks @markdewing! You 1-line solution worked beautifully.

@Paul-St-Young
Copy link
Contributor Author

I added documentation on the current estimator collection procedure. I think there are improvements to be made, but my understanding is that @ye-luo is already working on it.

I will work to fix the current bug within the existing design.

\subsection{Class Relations}
A large number of classes are involved in the estimator collection process. They often have misleading class name or method name. Document gotchas in the following list:
\begin{enumerate}
\item \verb|EstimatorManager| is an unused copy of \verb|EstimatorManagerBase|. \verb|EstimatorManagerBase| is the class used in the QMC drivers. (PR \#371 explains this)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can EstimatorManager.cpp and .h be deleted? (See #371). That would address this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not do the deletion of the file but just don't mention it in the tex. I will soon PR some changes including the deletion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will take out this item from the doc


Two ensemble-level data structures \verb|ParticleSet::Properties| and \verb|::Collectables| serve as intermediaries between evaluate classes and output classes to scalar.dat and stat.h5. \verb|Properties| appears in both scalar.dat and stat.h5, whereas \verb|Collectables| appears only in stat.h5. \verb|Properties| is overwritten by \verb|QMCHamiltonian::Observables| at the end of each step. \verb|QMCHamiltonian::Observables| is filled upon call to \verb|QMCHamiltonian::evaluate| and \verb|::auxHevaluate|. \verb|Collectables| is zeroed at the beginning of each step and accumulated upon call to \verb|::auxHevaluate|.

Data are outputted to scalar.dat in 4 stages: evaluate, load, unload, and collect. In the evaluate stage, \verb|QMCHamiltonian::Observables| is populated by a list of \verb|QMCHamiltonianBase|. In the load stage, \verb|QMCHamiltonian::Observables| is transfered to \verb|Properties| by \verb|QMCDriver|. In the unload stage, \verb|Properties| is copied to \verb|LocalEnergyEstimator::scalars|. In the collect stage, \verb|LocalEnergyEstimator::scalars| is block-averaged to \verb|EstimatorManagerBase|\\ \verb|::AverageCache| and dumped to file. For \verb|Collectables|, the evaluate and load stages are combined in a call to \verb|QMCHamiltonian::auxHevaluate|. In the unload stage, \verb|Collectables| is copied to \verb|CollectablesEstimator::scalars|. In the collect stage, \verb|CollectablesEstimator|\\ \verb|::scalars| is block-averaged to \verb|EstimatorManagerBase::AverageCache| and dumped to file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph should go before the detailed description of each stage, as an overview.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I wrote this summary last, but it would serve as a good overview.

\begin{enumerate}
\item \verb|EstimatorManager| is an unused copy of \verb|EstimatorManagerBase|. \verb|EstimatorManagerBase| is the class used in the QMC drivers. (PR \#371 explains this)
\item \verb|EstimatorManagerBase::Estimators| is completely different from \verb|QMCDriver::Estimators|, which is subtly different from \verb|QMCHamiltonianBase::Estimators|. The first is a list of pointers to \verb|ScalarEstimatorBase|. The second is the master estimator (one per MPI group). The third is the slave estimator that exists one per OpenMP thread.
\item \verb|QMCHamiltonian| is NOT a parent class of \verb|QMCHamiltonianBase|. Instead, \verb|QMCHamiltonian| owns two lists of \verb|QMCHamiltonianBase| named \verb|H| and \verb|auxH|.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the names, I would (incorrectly) expect QMCHamiltonian to be derived from QMCHamiltonianBase. It would be good to say what these are (or if they are defined in a previous section). QMCHamiltonianBase is the base class for individual Hamiltonian pieces, and QMCHamiltonian is the container for all of them. H is for Hamiltonian elements that contribute to the energy, and auxH is for elements that do not. (Someone else can comment if this description is wrong?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct. A "HamiltonianBase" element is really a component of the Hamiltonian for ones that contribute to the energy and just an observable/operator otherwise. The common denominator for HamiltonianBase is that each derived instance represents a quantum mechanical observable/operator.

\item \verb|EstimatorManager| is an unused copy of \verb|EstimatorManagerBase|. \verb|EstimatorManagerBase| is the class used in the QMC drivers. (PR \#371 explains this)
\item \verb|EstimatorManagerBase::Estimators| is completely different from \verb|QMCDriver::Estimators|, which is subtly different from \verb|QMCHamiltonianBase::Estimators|. The first is a list of pointers to \verb|ScalarEstimatorBase|. The second is the master estimator (one per MPI group). The third is the slave estimator that exists one per OpenMP thread.
\item \verb|QMCHamiltonian| is NOT a parent class of \verb|QMCHamiltonianBase|. Instead, \verb|QMCHamiltonian| owns two lists of \verb|QMCHamiltonianBase| named \verb|H| and \verb|auxH|.
\item \verb|QMCDriver::H| is NOT the same as \verb|QMCHamiltonian::H|. The first is a pointer to a \verb|QMCHamiltonian|. \verb|QMCHamiltonian::H| is a list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could subsumed into the previous item by pointing out the QMCDriver::H is of type QMCHamiltonian (and QMCHamiltonian::H was defined in that point)

A large number of classes are involved in the estimator collection process. They often have misleading class name or method name. Document gotchas in the following list:
\begin{enumerate}
\item \verb|EstimatorManager| is an unused copy of \verb|EstimatorManagerBase|. \verb|EstimatorManagerBase| is the class used in the QMC drivers. (PR \#371 explains this)
\item \verb|EstimatorManagerBase::Estimators| is completely different from \verb|QMCDriver::Estimators|, which is subtly different from \verb|QMCHamiltonianBase::Estimators|. The first is a list of pointers to \verb|ScalarEstimatorBase|. The second is the master estimator (one per MPI group). The third is the slave estimator that exists one per OpenMP thread.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like one big difference between these is related to parallelization - MPI and OpenMP? It seems like it would be helpful to describe how the parallelization structure drives some of structure of the estimator accumulation. Also, how does this structure affect the weighted summation in E_VMC and E_DMC?


In a DMC simulation, the weight of each walker is different and may change from step to step. Further, the ensemble size varies from step to step. Therefore, eq.~(\ref{eq:estimator}) simplifies to
\begin{align}
E_{DMC}[O] = \frac{1}{N_{step}} \sum_{s} \left\{ \left(\sum_e w_{s,e} O(\bs{R}_{s,e}) \right) / \left( \sum \limits_{e} w_{s,e} \right) \right\}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do the differences in the scalar.dat and stat.h5 come about? Are there differences in how this weighted average gets grouped and implemented?

The change from E[O] to E_DMC[O] looks like an approximation, not an identity (?) To be completely correct, there should be a weight for each ensemble average as well ( ensemble weight = sum_e w_{s,e} / sum of all weights). The bias will be reduced if the ensemble weights do not vary much (limit of large ensemble or small timesteps).

@prckent
Copy link
Contributor

prckent commented Nov 27, 2018

This PR has not been updated in some while. There are a few outstanding questions that need answering before merging and some conflicts to fix. Should we leave this PR open?

@ghost ghost assigned ye-luo Nov 27, 2018
@ye-luo
Copy link
Contributor

ye-luo commented Nov 27, 2018

I fixed the merge conflict. This PR seems not solving our issue but mostly documentation, it can be merged.

Copy link
Contributor

@ye-luo ye-luo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to have this PR merged. We will revise the manual entry as we refactor estimator manager.

@ye-luo ye-luo changed the title [WIP] make averages of Collectables and Properties consistent Add sofk estimator test and manual entry about how estimator works Dec 17, 2018
@qmc-robot
Copy link

Can one of the admins verify this patch?

@ye-luo ye-luo merged commit ca320f4 into QMCPACK:develop Dec 18, 2018
@ghost ghost removed the in progress label Dec 18, 2018
@Paul-St-Young Paul-St-Young deleted the dat-h5 branch October 26, 2019 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants