Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VirtualSites are silently dropped from itp #1

Open
fgrunewald opened this issue Sep 8, 2020 · 4 comments
Open

VirtualSites are silently dropped from itp #1

fgrunewald opened this issue Sep 8, 2020 · 4 comments

Comments

@fgrunewald
Copy link

fgrunewald commented Sep 8, 2020

Hi,

First of all I want to say that this is a very interesting project. I'm very much looking forward to using the program in some high throughout parametrization of models. However, it seems that the program fails to include virtual sites. I assume they are dropped because they cannot be optimized. However, the program also does not issue a warning or error message at the pre process stage.

From a martini centric perspective it would be very useful to have the code simply keep virtual sites, because a lot of martini3 models are using virtual sites. Can this be implemented in the current code?

EDIT: Actually after hacking in VS treating them like exclusions, I figured out that this will not be possible to have because MDAnalysis make whole does not support making whole molecules that have VS (i.e. not contiguous bonds).

@CharlyEmpereurmot
Copy link
Collaborator

Hello,

Thank you for your interest in this project. It is still in its early days and we have not extensively tested all use cases, notably with virtual sites. Also refactoring is on-going, which we expect will ease the inclusion of more potential functions + usage with different MD engines.

In principle there is no reason VS cannot be optimized for bonded. One of the interesting aspect of Swarm-CG is that it relies on FST-PSO to optimize as many objects (bonds/angles/dihedrals) as the topology defines, while still offering some guarantee about the structure of the CG model, even for big molecules with nested topologies.

We will look into usage with VS very soon and include this in the next release. According to what you are saying, I think we need to read the ITP to find bonds involving VS and create these bonds in the MDAnalysis universe if they do not exist, before calling make_whole.

Could you please attach here the input files you are using ? Or send via email at charly.empereur@gmail.com. You could make the trajectory ~100 frames if the files are too heavy, it's just for testing.

@fgrunewald
Copy link
Author

Hi,

According to what you are saying, I think we need to read the ITP to find bonds involving VS and create these bonds in the MDAnalysis universe if they do not exist, before calling make_whole.<

Pretty much yes. Essentially VS are constructed by geometric rules from other atoms. So in that sense they are not bonded by constraints or bonds to any other atom. They cheap option would then indeed be to put "dummy" bonds into the MDAnalysis universe and then call make whole. On the other hand the cleaner option would be to fix this in MDAnalysis. It shouldn't be too hard.

Regarding the files, I'll prepare an example setup for you of a similar system. I cannot share the exact input parameters because they are currently in the process of publication.

@CharlyEmpereurmot
Copy link
Collaborator

CharlyEmpereurmot commented Oct 5, 2020

Hello,

So it should be working fine now, starting from version 1.1.7 all virtual sites functions are properly handled. We also added the option to interpret a mapping file using either center of mass (COM) or center of geometry (COG). The latter seems to be increasingly used in the context of MARTINI 3. You can upgrade to version >= 1.1.7 from pypi.

Please let us know if it works for you !
And don't hesitate if you find any strange behavior, or if you think about new useful features :)

@fgrunewald
Copy link
Author

fgrunewald commented Oct 29, 2020

Hi,

Thanks for dealing with this issue. It took me a while to circle back to this, but I just tested the recent version 1.1.8 and I still find a couple of issues:

  • the virtual-site bead types need to start with 'V' according to the error that is raised when they are not. However, in the code they are required to be 'v'. In general it is not a good idea at all to relay on bead-types for this kind of thing, because they frequently change. In particular with Martini3 the virtual-site is called U. If I were you I would look into using a more robust itp parser.
  • One option for more robust IO of itps would be to use vermouth with SwarmCG. https://github.com/marrink-lab/vermouth-martinize . We have extensively tested both writing and reading itp files at the CG/AA level. From what I read see in yout code this could fairly easily be done, although it is quite a rewrite. However, I'm sure that it would make your parsing more future proof and robust. In any case if you want, we (i.e. marrink-lab) are here to help you with that.
  • After fixing the V vs V issue, I ran into an error regarding the mapping file. It appears VS do not need to be specified in the mapping file? Is that right? Anyways, if I delete them they are read without issue.
  • After dealing with that issue: I get to an issue on this line:
    for bead_id in range(len(ns.cg_itp['atoms'])):

    here the iterator goes over all atoms and later VS are excluded by bead_type name. However, the atomgroup dict does only have indices for all 'real' beads. If you test this with a single virtual-site and that happens to be at the end of your molecule, this doesn't become a problem. However, for me as I have more than one I directly got into trouble.
    I propose to replace it with:
    for bead_id in range(len(ns.real_beads_ids)):
    If I do that it seems to work. If you want I can make a PR with the changes I made to make the VS work with more than 1.
  • Last but not least I seem to have issues with convergence and monitoring. I'll make a separate issue from that. However, one thing I can already tell is that the log file needs to be written with a certain frequency otherwise the run is detected as being failed. It would be good to point this out somewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants