Unusual behavior of "OneShiftOnly" and comparison of other optimizers #3968

aannabe · 2022-04-22T19:38:09Z

Describe the bug
Unusual behavior was observed for the Tb atom when using the OneShiftOnly optimizer. See the comparison of optimizers for energy and variance.

To Reproduce

Built from a recent commit 0ac62ea
Compile script is attached.

System:

system name: NERSC/Cori-KNL
modules loaded: See attached file

Additional context
All optimizers use the same parameters:

                linopt1 = linear(
                    energy               = 0.9,
                    unreweightedvariance = 0.0,
                    reweightedvariance   = 0.1,
                    samples              = int(1e6),
                    substeps             = 3,
                    warmupSteps          = 20,
                    blocks               = 50,
                    nonlocalpp           = True,
                    usedrift             = True,
                    minmethod            = optimizer,
                    minwalkers           = 0.1,
                    timestep             = 0.5,
                    )

Other parameters are default ones.

Using the same resources, the execution times are given below:

Optimizer	Execution Time (secs)
quartic		1641
adaptive	895.4
OneShiftOnly	359
descent		269.5

optimizers_compare.zip


ye-luo · 2022-04-22T19:53:13Z

Could you enable use_nonlocalpp_deriv?

prckent · 2022-04-22T20:06:10Z

Isn't this on by default now? (I am amazed that the optimizer was as successful as it was without them.)

ye-luo · 2022-04-22T20:17:32Z

In this system, NonLocalECP/LocalEnergy is ~27% in VMC, which is why I think use_nonlocalpp_deriv can be critical.
It is still not the default. Wavefunction derivative support is still sparse; see 2b) in #3789.
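For readers who want to try this with the input from the issue, a minimal sketch of the change follows. It assumes the Nexus `linear(...)` block accepts a `use_nonlocalpp_deriv` keyword matching the QMCPACK parameter named above; that keyword, and the choice of `"OneShiftOnly"` for `minmethod`, are illustrative assumptions, not something confirmed in this thread.

```python
# Parameters copied from the issue, plus the flag suggested above.
# ASSUMPTION: Nexus forwards `use_nonlocalpp_deriv` to the QMCPACK input;
# check your Nexus version before relying on this.
linopt_kwargs = dict(
    energy               = 0.9,
    unreweightedvariance = 0.0,
    reweightedvariance   = 0.1,
    samples              = int(1e6),
    substeps             = 3,
    warmupSteps          = 20,
    blocks               = 50,
    nonlocalpp           = True,
    use_nonlocalpp_deriv = True,   # the only addition to the original block
    usedrift             = True,
    minmethod            = "OneShiftOnly",
    minwalkers           = 0.1,
    timestep             = 0.5,
)

# With Nexus available, this would be used as: linopt1 = linear(**linopt_kwargs)
print(linopt_kwargs["use_nonlocalpp_deriv"])
```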

ye-luo · 2022-04-22T20:41:35Z

NLPP derivatives seem to be the key. On my workstation:

optJ12_OneShiftOnly_nlpp$ qmca -q ev *.scalar.dat
                            LocalEnergy               Variance           ratio 
optJ12  series 0  -126.302186 +/- 0.005828   13.094399 +/- 0.021925   0.1037 
optJ12  series 1  -126.847775 +/- 0.002220   3.525128 +/- 0.012072   0.0278 
optJ12  series 2  -126.873861 +/- 0.002854   4.278754 +/- 0.014905   0.0337 
optJ12  series 3  -126.885645 +/- 0.003060   4.437665 +/- 0.024933   0.0350 
optJ12  series 4  -126.886982 +/- 0.002496   4.461602 +/- 0.016027   0.0352 
optJ12  series 5  -126.884445 +/- 0.003074   4.459609 +/- 0.016952   0.0351 
optJ12  series 6  -126.884534 +/- 0.003664   4.484875 +/- 0.019814   0.0353 
optJ12  series 7  -126.886260 +/- 0.003808   4.563254 +/- 0.035779   0.0360 
optJ12  series 8  -126.887999 +/- 0.003236   4.482411 +/- 0.015329   0.0353 
optJ12  series 9  -126.886232 +/- 0.002063   4.547143 +/- 0.023041   0.0358
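The `ratio` column in this output is consistent with Variance divided by the magnitude of LocalEnergy. A small sketch that reproduces it from two rows of the table (the function name is made up for illustration):

```python
# Values copied from the qmca table above (series 0 and series 9).
rows = [
    (0, -126.302186, 13.094399),
    (9, -126.886232, 4.547143),
]

def variance_energy_ratio(energy, variance):
    """Return Variance / |LocalEnergy|, which matches the 'ratio' column."""
    return variance / abs(energy)

for series, energy, variance in rows:
    ratio = variance_energy_ratio(energy, variance)
    print(f"series {series}: ratio = {ratio:.4f}")
```

This gives 0.1037 for series 0 and 0.0358 for series 9, in agreement with the table.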

aannabe · 2022-04-22T20:53:18Z

Below are the results with derivatives. It does indeed fix the issue.

The old/new execution times:

Optimizer	Time_No_Deriv (secs)	Time_with_Deriv (secs)	Increase(%)
quartic		1641			1653			0.7
adaptive	895.4			991.9			10.7
OneShiftOnly	359			510.7			42.3
descent		269.5			359.1			33.2

I agree with having use_nonlocalpp_deriv on by default.
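The Increase(%) column can be recomputed from the two timing columns as a relative increase. A minimal sketch, using the timings from the table (the helper name is made up for illustration):

```python
# (time without derivatives, time with derivatives), in seconds, from the table above.
timings = {
    "quartic":      (1641.0, 1653.0),
    "adaptive":     (895.4, 991.9),
    "OneShiftOnly": (359.0, 510.7),
    "descent":      (269.5, 359.1),
}

def percent_increase(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before

for name, (t_no_deriv, t_with_deriv) in timings.items():
    print(f"{name:13s} {percent_increase(t_no_deriv, t_with_deriv):5.1f}")
```

The results agree with the table to within 0.1 percentage points.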

ye-luo · 2022-04-22T21:01:06Z

Your curves confirm my expectation.

quartic has a cost function that mixes energy and variance, which is why it has the lowest variance but not the lowest energy.
oneshift/adaptive/descent do energy minimization only, so they converge to the same energy and variance: the energy is lower than with quartic, but the variance is higher.
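To make the distinction concrete, here is a rough sketch of how a mixed cost can rank two wavefunctions differently from a pure-energy cost. It assumes a plain weighted sum with the 0.9/0.1 weights from the input in the issue; QMCPACK's actual cost function (normalization, reweighting) may differ. The energies and variances are series 1 and series 9 from the qmca output earlier in this thread, with the reported Variance used as a stand-in for the reweighted variance.

```python
def mixed_cost(energy, variance, w_energy=0.9, w_variance=0.1):
    """Illustrative weighted sum of energy and variance (not QMCPACK's exact formula)."""
    return w_energy * energy + w_variance * variance

# (LocalEnergy, Variance) taken from the qmca table in this thread.
series_1 = (-126.847775, 3.525128)
series_9 = (-126.886232, 4.547143)

# Pure energy minimization prefers series 9 (lower energy).
energy_prefers_9 = series_9[0] < series_1[0]

# The 0.9/0.1 mix prefers series 1 (its lower variance outweighs the energy gap).
mixed_prefers_1 = mixed_cost(*series_1) < mixed_cost(*series_9)

print("energy-only prefers series 9:", energy_prefers_9)
print("mixed cost prefers series 1: ", mixed_prefers_1)
```

Under these assumptions, an optimizer that honors the mixed weights and one that minimizes energy alone would not be expected to settle on the same parameters.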

aannabe · 2022-04-22T21:16:41Z

I wasn't aware that only quartic supports the mixing of energy and variance. I think this is not mentioned in the documentation.
For instance, this block in the documentation gives the wrong impression that OneShiftOnly will do mixed-cost optimization:

https://github.com/QMCPACK/qmcpack/blob/develop/docs/methods.rst#:~:text=%3Cloop%20max%3D%2210,parameter%3E%0A%20%20%20...%0A%20%3C/qmc%3E%0A%3C/loop%3E

prckent · 2022-04-22T21:42:17Z

Hmm. I know there are some gaps, but even going back to the 2014 and 2016 workshop it was assumed that arbitrary mixes of energy and variance were supported by all, or nearly all, of the optimizers.

I think we need to get more serious on this topic and should at least issue a warning if the cost function is not pure energy for optimizers that don't support this. (Similarly, #3969 should have warnings where there are gaps in implementation.)

ye-luo · 2022-04-22T21:50:21Z

Users rarely read QMCPACK warnings unless a run breaks down; they are more likely to read the documentation.
I doubt that issuing a warning really helps, and implementing one costs developer cycles.

jtkrogel · 2022-04-25T18:37:22Z

IMO optimizers that do not use <cost/> should abort if it is provided. This will prevent false conclusions from being made about the relationship between the inputs and outputs.
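A minimal sketch of the kind of check proposed here, written as a standalone Python function. This is not QMCPACK code: the function name is hypothetical, and the set of energy-only methods is an assumption taken from the discussion in this thread.

```python
# ASSUMPTION (from this thread): these methods minimize energy only.
ENERGY_ONLY_METHODS = {"oneshiftonly", "adaptive", "descent"}

def check_cost_weights(minmethod, energy=1.0,
                       unreweightedvariance=0.0, reweightedvariance=0.0):
    """Raise ValueError if variance weights are given to an energy-only optimizer."""
    has_variance_weight = unreweightedvariance != 0.0 or reweightedvariance != 0.0
    if minmethod.lower() in ENERGY_ONLY_METHODS and has_variance_weight:
        raise ValueError(
            f"minmethod '{minmethod}' minimizes energy only; "
            "variance weights in the cost function would be ignored"
        )

# The input from the issue would be rejected for OneShiftOnly.
try:
    check_cost_weights("OneShiftOnly", energy=0.9, reweightedvariance=0.1)
except ValueError as err:
    print("aborting:", err)
```

Failing early in this way means the run stops before any results can be attributed to weights that were never used.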

jtkrogel · 2022-04-25T18:39:47Z

@aannabe thanks for reporting the unusual behavior (both w.r.t. optimization performance and input inconsistencies). This kind of information is quite valuable and not enough people take the time to report.

jtkrogel · 2022-04-25T18:54:00Z

A follow-on question: why is OneShiftOnly so much more sensitive to the exclusion of the nlpp derivative data than the other optimizers? Reduced robustness in one context raises questions about others.

ye-luo · 2022-04-25T19:05:05Z

@jtkrogel

If cost doesn't apply to an optimization method, the code should stop. See: OneShiftOnly should abort when cost function is provided #1494
Timing: I need to see the full details (the full mpirun line, input and output files). We need to scrutinize the timing data before discussing why.

ye-luo mentioned this issue Apr 22, 2022

Turn on NLPP energy derivatives by default. #3969

Merged

ye-luo closed this as completed in #3969 Apr 27, 2022
