ForceBalance takes really long to complete #275

Open
rvkrishnan30 opened this issue Nov 18, 2022 · 2 comments
Comments

@rvkrishnan30

Hi,

I am running ForceBalance to generate custom parameters for a specific torsion and, for reasons not completely clear to me, it takes a really long time (2 or more days).

As I ran into this issue around the same time as the Openff-bespokefit workshop, I discussed it with them (openforcefield/openff-bespokefit#200) and tested loosening the convergence criteria to see if that helped. In the test case with the S1A ligand (image below, with the torsion of interest highlighted in red), this sped things up without affecting the accuracy (from ~40 hrs to ~19 hrs). However, it made little difference to the run time for another molecule (unfortunately not public yet, but it is not a macrocycle), where it only came down from 22.2 to 21.2 hrs.
[image: S1A ligand with the torsion of interest highlighted in red]

The table below summarizes the changes in convergence criteria, the speed improvement seen, and the K values, along with the resulting torsion plots.
image
image

Below is my example input file passed to ForceBalance.

$options
ffdir forcefield
penalty_type L1
jobtype optimize
forcefield force-field.offxml

maxstep 10
convergence_step 1.00
convergence_objective 1.00
convergence_gradient 0.01
criteria 2
eig_lowerbound 0.01
finite_difference_h 0.01
penalty_additive 1.0

trust0 -0.25
mintrust 0.05
error_tolerance 1.0
adaptive_factor 0.2
adaptive_damping 1.0
normalize_weights False
constrain_charge false

priors
ProperTorsions/Proper/k : 6.0
/priors

$end

$target
name ligand_fragment_around_22_24
weight 1.0

type TorsionProfile_SMIRNOFF
mol2 ligand_fragment_around_22_24.sdf
pdb ligand_fragment_around_22_24.pdb
coords scan.xyz
writelevel 2
attenuate 1

energy_denom 1.0
energy_upper 10.0
$end

I suspect something is taking much longer than it should and is slowing down the entire FB fitting process. Could you please suggest a good way to debug this and narrow down the cause, and also which other parameters could be tuned to avoid this behaviour? I am also going through the ForceBalance documentation to figure out what else could help.

Additional information in case needed:
I am running ForceBalance version 1.9.2 on a Debian machine.

Please let me know if you need any additional information and I'll be happy to help.

Thank you
Venkat

@leeping
Owner

leeping commented Nov 21, 2022

Hello Venkat,

Thanks for your message. As I understand it, your calculation is taking longer than 20 hours to parameterize a torsion for a single molecule? That is definitely longer than expected.

From looking at your input file, I can see that you've set the maximum number of optimization cycles (maxstep) to 10. However, the number of objective function evaluations might be higher, because FB will carry out "micro-iterations" within each optimization cycle if trust0 is set to a negative number. This is supposed to be beneficial for calculations where computing the gradient of the objective function is expensive compared to single-point evaluations, which is often the case if there are many parameters being optimized. In your case, the number of parameters might be small (<10), which suggests you could get a performance boost by making trust0 a positive number. However, each individual step will not be as good as when a negative value of trust0 is used, so I suggest that you compare the performance between negative and positive trust0, and increase maxstep if needed for the positive trust0.
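To set up the comparison described above, one option is to generate a second input file that differs only in trust0 and maxstep. The following is a minimal sketch that edits the input as plain text; it does not use any ForceBalance API. The helper name `set_option` is hypothetical, and the values 0.25 and 30 are illustrative assumptions rather than recommended settings.

```python
import re


def set_option(input_text: str, name: str, value: str) -> str:
    """Return input_text with the first 'name value' line given a new value.

    Hypothetical helper: it treats the ForceBalance input as plain text and
    matches a line that starts with the option name followed by whitespace.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]+)\S+(.*)$", re.MULTILINE
    )
    new_text, n_replaced = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), input_text, count=1
    )
    if n_replaced == 0:
        raise KeyError(f"option {name!r} not found in input")
    return new_text


# Fragment of the $options block from the input file in this thread.
original = """$options
maxstep 10
trust0 -0.25
mintrust 0.05
$end
"""

# Assumed illustrative values: a positive trust radius and more cycles.
variant = set_option(original, "trust0", "0.25")
variant = set_option(variant, "maxstep", "30")
print(variant)
```

Running both inputs on the same target and comparing wall time and final objective value should show whether the micro-iterations account for the long run time.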

The other thing that's possibly happening is that the OpenMM energy minimization is taking longer than expected. The time it takes to do an energy minimization will affect the time it takes to evaluate the objective function, because the total number of optimizations is (n_grid_points * n_ff_parameters * 2) for each of your molecules. You can add some timing code to openmmio.py or smirnoffio.py to record the time spent on each energy minimization as part of the whole procedure, and draw a histogram of the run times. It would be good to know whether an individual OpenMM energy minimization takes a long time on average, or whether it takes a very long time only in a minority of cases. (I think for a small organic molecule in the gas phase it should take <0.1 seconds.) If you find that these are taking a long time, then we might want to see how we can improve the performance of the OpenMM energy minimizer.
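The timing suggestion above could be sketched roughly as follows. This is not code from ForceBalance: `timed`, `summarize`, and `estimate_minimizations` are hypothetical helpers, and the decorator would need to be applied by hand to whichever function in openmmio.py or smirnoffio.py performs the minimization. The 24 grid points and 4 parameters in the example are assumed numbers, used only to illustrate the formula quoted above.

```python
import statistics
import time
from functools import wraps


def estimate_minimizations(n_grid_points: int, n_ff_parameters: int) -> int:
    """Number of minimizations per molecule, using the formula quoted above."""
    return n_grid_points * n_ff_parameters * 2


def timed(records: list):
    """Decorator factory: append the wall time of each call to `records`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even if the wrapped call raises.
                records.append(time.perf_counter() - start)
        return wrapper
    return decorator


def summarize(records: list, slow_threshold: float = 0.1) -> dict:
    """Summarize durations; n_slow counts calls slower than slow_threshold (s)."""
    return {
        "count": len(records),
        "mean": statistics.mean(records),
        "median": statistics.median(records),
        "max": max(records),
        "n_slow": sum(t > slow_threshold for t in records),
    }


# Assumed example: 24 grid points and 4 fitted parameters.
n_min = estimate_minimizations(24, 4)
print(n_min, "minimizations; about", n_min * 0.1, "s if each takes 0.1 s")

# Stand-in for a real minimization call, only to show the decorator in use.
durations = []

@timed(durations)
def fake_minimize():
    time.sleep(0.001)

for _ in range(5):
    fake_minimize()
print(summarize(durations))
```

A mean close to the maximum would point to uniformly slow minimizations, while a small median with a large maximum and a few slow calls would point to a minority of difficult grid points. The recorded durations can be passed to any plotting tool to draw the histogram.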

Thanks,

- Lee-Ping

@rvkrishnan30
Author

Thank you @leeping, that's helpful. As suggested, I'll run some tests with different settings (trust0 and maxstep) and see how they affect the performance. I'll also insert some timing code to find out where it spends most of the time during the MM steps.

If it helps, attached are the files:
S1A_venkat_Nov2022.tar.gz
