AuxReader for EDR Files #3749

Merged
merged 63 commits into from
Sep 19, 2022
Changes from 3 commits
Commits
f24d296
initial file created, test working on different computers
BFedder Jul 8, 2022
0906225
continued work
BFedder Jul 8, 2022
a7c1a81
Minimum working system: EDRReader
BFedder Jul 9, 2022
485b5d8
Fixed typos, PEP8
BFedder Jul 20, 2022
1048271
added pyedr to CI setup
BFedder Jul 20, 2022
0b49bad
EDRReader now fully functional in stand-alone use
BFedder Jul 20, 2022
e0767b4
Changed EDR so that base.get_description() works
BFedder Jul 20, 2022
0f7b1af
Implemented tests for standalone use
BFedder Jul 24, 2022
4f944b3
Added pyedr to azure-pipelines.yml
BFedder Jul 24, 2022
ca1a014
addressed orbeckst's reviews, fixed test for datafiles.py
BFedder Jul 28, 2022
54022bf
begin refactoring coordinates/base.py
BFedder Jul 29, 2022
a7f27b1
further work on handling energy data
BFedder Jul 29, 2022
ca39385
EDR Reader functionality complete
BFedder Jul 31, 2022
6159f2d
Added pyedr to maintainer/conda/environment.yml
BFedder Aug 3, 2022
f228aa7
PEP8 stuff
BFedder Aug 3, 2022
73143f6
PEP 8
BFedder Aug 3, 2022
3be01c6
improving coverage
BFedder Aug 3, 2022
46584ab
further coverage improvement
BFedder Aug 3, 2022
0733647
added function for returning data from auxreader
BFedder Aug 4, 2022
77d8fa1
modified CHANGELOG
BFedder Aug 4, 2022
3105f0e
Added type hinting
BFedder Aug 4, 2022
b2f30e3
added documentation
BFedder Aug 4, 2022
f48c74c
merged develop into edr_reader
BFedder Aug 4, 2022
1f921ea
Merge branch 'MDAnalysis-develop' into edr_reader
BFedder Aug 4, 2022
9152604
Use List, Dict from typing to fix CI
BFedder Aug 4, 2022
f59e6fa
Update package/MDAnalysis/auxiliary/EDR.py
BFedder Aug 4, 2022
b29747c
Fixed Docstring for EDRStep class
BFedder Aug 4, 2022
3916bdf
Merge branch 'edr_reader' of https://github.com/BFedder/mdanalysis in…
BFedder Aug 4, 2022
0092d34
pep8....
BFedder Aug 4, 2022
9235600
Merge branch 'MDAnalysis:develop' into edr_reader
BFedder Aug 5, 2022
08907c3
addressing some of IAlibay's comments
BFedder Aug 5, 2022
e78ab9e
Moved adding of aux data from coordinates/base to auxiliary/base
BFedder Aug 13, 2022
0119ee1
refined API overhaul and hopefully fixed CI, also now allowing None t…
BFedder Aug 14, 2022
c233970
Addressing reviewer comments
BFedder Aug 15, 2022
cccc1fd
Apply suggestions from code review
BFedder Aug 21, 2022
aa226ca
fixed test for get_data
BFedder Aug 21, 2022
d77517b
changed get_data docstring
BFedder Aug 21, 2022
fed3b06
added class attribute documentation to EDRReader
BFedder Aug 21, 2022
ed20993
added memory usage monitoring
BFedder Aug 21, 2022
cea6a97
Memory warning now reports memory usage
BFedder Aug 22, 2022
3660770
Warning message changed from KB to GB
BFedder Aug 22, 2022
1a0f4e3
fixed parentheses escape by using r-string
BFedder Aug 22, 2022
117d138
updated CHANGELOG, added test for NotImplemented, pickle, etc.
BFedder Aug 24, 2022
c7bde96
removed asterisk option from attach_auxiliary()
BFedder Sep 1, 2022
55a794a
Unit handling for EDRReaders
BFedder Sep 1, 2022
5e7a779
added TODO comments for issue 3811
BFedder Sep 1, 2022
dfaba25
Merge branch 'develop' into edr_reader
BFedder Sep 1, 2022
d1892f6
minor change to trigger GH Actions
BFedder Sep 2, 2022
c5a3a7c
added pyedr to requirements.txt and setup.py for cirrus
BFedder Sep 2, 2022
2cc8390
Merge branch 'develop' into edr_reader
orbeckst Sep 4, 2022
88c4651
Added 'convert_units' kwarg to EDRReader __init__
BFedder Sep 6, 2022
8b71850
Documentation overhaul
BFedder Sep 6, 2022
d599e97
Merge branch 'develop' into edr_reader
BFedder Sep 6, 2022
e8f5998
went from 4810 warnings to 237 in like 7 lines of code
BFedder Sep 6, 2022
336cc80
Merge branch 'edr_reader' of https://github.com/BFedder/mdanalysis in…
BFedder Sep 6, 2022
a62066e
Add test for MDANALYSIS_BASE_UNITS, some doc work
BFedder Sep 6, 2022
0ad0079
revert from ResourceWarning to UserWarning
BFedder Sep 6, 2022
064ab37
pep8 stuff
BFedder Sep 7, 2022
e256498
make pyedr optional and other IAlibay requests
BFedder Sep 12, 2022
8d9e077
PEP8
BFedder Sep 12, 2022
11c3924
changed changelog, docstring; used defaultdict
BFedder Sep 14, 2022
e3a2bb8
fixed CHANGELOG
BFedder Sep 19, 2022
828b2dc
Apply suggestions from code review
BFedder Sep 19, 2022
168 changes: 168 additions & 0 deletions package/MDAnalysis/auxiliary/EDR.py
@@ -0,0 +1,168 @@
# -*- Mode: python; tab-width: 4; indent-tabs-mode:nil; coding:utf-8 -*-
# vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4
#
# MDAnalysis --- https://www.mdanalysis.org
# Copyright (c) 2006-2017 The MDAnalysis Development Team and contributors
# (see the file AUTHORS for the full list of names)
#
# Released under the GNU Public Licence, v2 or any higher version
#
# Please cite your use of MDAnalysis in published work:
#
# R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler,
# D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein.
# MDAnalysis: A Python package for the rapid analysis of molecular dynamics
# simulations. In S. Benthall and S. Rostrup editors, Proceedings of the 15th
# Python in Science Conference, pages 102-109, Austin, TX, 2016. SciPy.
# doi: 10.25080/majora-629e541a-00e
#
# N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein.
# MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations.
# J. Comput. Chem. 32 (2011), 2319--2327, doi:10.1002/jcc.21787
#

"""
EDR auxiliary reader --- :mod:`MDAnalysis.auxiliary.EDR`
========================================================

EDR files are binary files following the XDR protocol. They are written by
GROMACS during simulations and contain the time-series energy data of the
system.

panedr is a Python package ( https://github.com/mdanalysis/panedr ) that reads
these binary files and returns them in human-readable form, either as a Pandas
DataFrame or as a dictionary of NumPy arrays. It is used by the EDR auxiliary
reader to parse EDR files. As such, a dictionary with string keys and numpy
array values is loaded into the EDRReader.

The EDR auxiliary reader takes the output from panedr and loads the energy data
as auxiliary data into Universes. Standalone usage is also possible, where the
energy terms are extracted without associating them with the trajectory, for
example, to allow easy plotting of the energy terms.



"""
import numbers
import os
import numpy as np
from . import base
from ..lib.util import anyopen
import panedrlite as panedr


class EDRStep(base.AuxStep):
    """ AuxStep class for .edr file format.

    Extends the base AuxStep class to allow selection of time and
    data-of-interest fields (by column index) from the full set of data read
    each step.

    Parameters
    ----------
    time_selector : int | None, optional
Member

Suggested change
time_selector : int | None, optional
time_selector : str, optional

From the __init__ signature?

Contributor Author

Oh, yes... Need to change the docstring here in general still, thanks!

        Index of column in .xvg file storing time, assumed to be in ps. Default
        value is 0 (i.e. first column).
    data_selector : list of int | None, optional
        List of indices of columns in .xvg file containing data of interest to
        be stored in ``data``. Default value is ``None``.
    **kwargs
        Other AuxStep options.

    See Also
    --------
    :class:`~MDAnalysis.auxiliary.base.AuxStep`
Member

I'd omit ~ so that it's clear that you're looking at the base.AuxStep

    """

    def __init__(self, time_selector="Time", data_selector=None, **kwargs):
Member

probably should go ahead and add type hints here as you go along


        super(EDRStep, self).__init__(time_selector=time_selector,
                                      data_selector=data_selector,
                                      **kwargs)

    def _select_time(self, key):
        return self._select_data(key)
        if key is None:
            # here so that None is a valid value; just return
            return
        if isinstance(key, numbers.Integral):
            return self._select_data(key)
        else:
            raise ValueError('Time selector must be single index')

    def _select_data(self, key):
        return self._data[key]
        if key is None:
            # here so that None is a valid value; just return
            return
        if isinstance(key, numbers.Integral):
            try:
                return self._data[key]
            except IndexError:
                errmsg = (f'{key} not a valid index for data with '
                          f'{len(self._data)} columns')
                raise ValueError(errmsg) from None
        else:
            return np.array([self._select_data(i) for i in key])


class EDRReader(base.AuxReader):
    """ Auxiliary reader to read data from a .edr file.
Member

As you know I've barely been able to say words today, a .edr file or an .edr file?

Contributor Author

I would say "from a .edr file", but "from an EDR file", but that's because I say "dot edr" in my mind

Member

an edr file for mine sorry to be a PITA

Member

Also, here and everywhere else, should we just be using EDR_ instead?

Contributor Author

Personally, I'm not sure the entry in the GROMACS manual is interesting or helpful enough to keep linking to it every time the format is mentioned


    Default reader for .edr files. All data from the file will be read and
    stored on initialisation.

    Parameters
    ----------
    filename : str
        Location of the file containing the auxiliary data.
    **kwargs
        Other AuxReader options.

    See Also
    --------
    :class:`~MDAnalysis.auxiliary.base.AuxReader`
Member

Don't use the ~


    Note
    ----
    The file is assumed to be of a size such that reading and storing the full
    contents is practical.
Member

Please elaborate based on the memory warning stuff
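
One way to make the note concrete, in the spirit of the memory-usage warning mentioned in the commit list, is to sum the sizes of the loaded arrays and warn above a threshold. The sketch below is only an illustration under stated assumptions: `estimate_memory_gb`, `warn_if_large`, and the `memory_limit_gb` parameter are hypothetical names that do not appear in the reviewed code, and the toy dictionary stands in for the parser output.

```python
import warnings

import numpy as np


def estimate_memory_gb(data):
    """Return the summed size of all arrays in ``data`` in GB (1 GB = 1e9 bytes)."""
    return sum(np.asarray(values).nbytes for values in data.values()) / 1e9


def warn_if_large(data, memory_limit_gb=1.0):
    """Emit a UserWarning when ``data`` needs more than ``memory_limit_gb``.

    Both the function name and the default limit are assumptions made for
    this sketch, not values taken from the pull request.
    """
    used = estimate_memory_gb(data)
    if used > memory_limit_gb:
        warnings.warn(
            f"Auxiliary data uses {used:.3f} GB, above the limit of "
            f"{memory_limit_gb} GB.",
            UserWarning,
        )
    return used


# Stand-in for the dictionary of NumPy arrays returned by the EDR parser.
toy_data = {
    "Time": np.arange(1000, dtype=np.float64),
    "Potential": np.zeros(1000, dtype=np.float64),
}
used_gb = warn_if_large(toy_data, memory_limit_gb=1.0)
```

Because the estimate relies only on `ndarray.nbytes`, it counts the array payloads and ignores the small overhead of the dictionary itself.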

    """

    format = "EDR"
    _Auxstep = EDRStep

    def __init__(self, filename, **kwargs):
        self._auxfile = os.path.abspath(filename)
        self.auxdata = panedr.edr_to_dict(filename)
        self._n_steps = len(self.auxdata["Time"])
        # attribute to communicate found energy terms to user
        self.terms = [key for key in self.auxdata.keys()]
Member

Suggested change
self.terms = [key for key in self.auxdata.keys()]
self.terms = list(self.auxdata.keys())

I think this will give you the same thing and resolve ~ 200 ns faster on my laptop?

Contributor Author

Oh that's much better, thank you!
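
The equivalence claimed in the suggestion can be checked with a small stand-alone snippet. This is a sketch using a toy dictionary in place of the real `auxdata`; the timings it prints depend on the machine and Python version, so no particular speed-up is asserted here.

```python
import timeit

# Toy stand-in for the dictionary of energy terms.
auxdata = {
    "Time": [0.0, 1.0],
    "Potential": [-10.0, -11.0],
    "Temperature": [300.0, 301.0],
}

# Both spellings produce a list of keys in insertion order.
terms_comprehension = [key for key in auxdata.keys()]
terms_list = list(auxdata.keys())
same = terms_comprehension == terms_list

# Rough timing comparison; absolute numbers vary between machines.
t_comp = timeit.timeit(lambda: [key for key in auxdata.keys()], number=100_000)
t_list = timeit.timeit(lambda: list(auxdata.keys()), number=100_000)
print(f"comprehension: {t_comp:.4f} s, list(): {t_list:.4f} s")
```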

        super(EDRReader, self).__init__(**kwargs)

    def _read_next_step(self):

        """ Read next auxiliary step and update ``auxstep``.

        Returns
        -------
        AuxStep object
            Updated with the data for the new step.

        Raises
        ------
        StopIteration
            When end of auxiliary data set is reached.
        """
        auxstep = self.auxstep
        new_step = self.step + 1
        if new_step < self.n_steps:
            auxstep._data = {term: self.auxdata[term][self.step] for term in self.terms}
            print(auxstep._data)
            auxstep.step = new_step
            return auxstep
        else:
            self.rewind()
            raise StopIteration


1 change: 1 addition & 0 deletions package/MDAnalysis/auxiliary/__init__.py
@@ -523,4 +523,5 @@

from . import base
from . import XVG
from . import EDR
from .core import auxreader
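
The step-by-step reading pattern used in `_read_next_step` can be illustrated outside MDAnalysis with plain NumPy arrays. The following is a minimal sketch, not the library's API: `ToyStepReader` is a hypothetical class, and the hard-coded dictionary stands in for what the EDR parser would return. In this sketch the arrays are indexed with the incremented step so that the first read returns row 0.

```python
import numpy as np


class ToyStepReader:
    """Hypothetical stand-in for an auxiliary reader; not the MDAnalysis API."""

    def __init__(self, auxdata):
        self.auxdata = auxdata
        self.terms = list(auxdata.keys())
        self.n_steps = len(auxdata["Time"])
        self.step = -1  # no step has been read yet

    def __iter__(self):
        self.step = -1  # start from the beginning on every iteration
        return self

    def __next__(self):
        new_step = self.step + 1
        if new_step >= self.n_steps:
            self.step = -1  # rewind so the reader can be iterated again
            raise StopIteration
        self.step = new_step
        # One value per energy term for the current step.
        return {term: self.auxdata[term][new_step] for term in self.terms}


# Toy data standing in for the parsed energy file.
toy_data = {
    "Time": np.array([0.0, 10.0, 20.0]),
    "Potential": np.array([-100.0, -101.5, -99.8]),
}
steps = list(ToyStepReader(toy_data))
```

Each element of `steps` is a dictionary mapping a term name to its value at that step, which mirrors the per-step dictionary built in the diff above.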