Serialize FileIO and TextIOWrapper and Universe #2723

Merged
merged 120 commits into from
Aug 8, 2020
Changes from 113 commits
931d9b5
Merge pull request #1 from MDAnalysis/develop
yuxuanzhuang May 20, 2020
d69ce98
Merge remote-tracking branch 'mda_origin/develop' into develop
yuxuanzhuang Jun 9, 2020
432edf3
add pickle function to fileio, textio
yuxuanzhuang Jun 9, 2020
bf55bf3
add basic test for pickle io
yuxuanzhuang Jun 9, 2020
cd9a485
add comments
yuxuanzhuang Jun 9, 2020
f33629f
xfail on python2
yuxuanzhuang Jun 10, 2020
aca4496
add doc and exception for pickle_open
yuxuanzhuang Jun 11, 2020
cd4ffe3
add doc for textio fileio class
yuxuanzhuang Jun 11, 2020
ec5bd3c
add parallel test for textio
yuxuanzhuang Jun 11, 2020
f6515ee
pep8
yuxuanzhuang Jun 11, 2020
cf29764
add an extra bufferlayer for FileIO for fast access
yuxuanzhuang Jun 11, 2020
6b09e20
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jun 11, 2020
c29c316
ditch py2
yuxuanzhuang Jun 11, 2020
db47e27
pep8 and doc
yuxuanzhuang Jun 11, 2020
7e9d6d3
add test for unsupported mode
yuxuanzhuang Jun 11, 2020
eb83a7c
pep8
yuxuanzhuang Jun 11, 2020
0baa868
typo
yuxuanzhuang Jun 12, 2020
c8e63b2
pickle reorder
yuxuanzhuang Jun 12, 2020
658b446
pickle_open as context manager
yuxuanzhuang Jun 12, 2020
1aa6003
format
yuxuanzhuang Jun 12, 2020
f2738bc
move pickle-io to a separate file
yuxuanzhuang Jun 12, 2020
001f3b8
doc
yuxuanzhuang Jun 12, 2020
d0374f5
FileIOPicklable class now only supports name as input, (preventing us…
yuxuanzhuang Jun 12, 2020
8c62df8
pickle open doc and add fspath for the filename
yuxuanzhuang Jun 12, 2020
94f1f8d
absolute import
yuxuanzhuang Jun 12, 2020
acbadec
more doc
yuxuanzhuang Jun 13, 2020
7043d2d
more pep8 and format
yuxuanzhuang Jun 15, 2020
546b05d
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jun 15, 2020
c259143
sphinx mark up
yuxuanzhuang Jun 15, 2020
a016a65
add pickle_open example
yuxuanzhuang Jun 15, 2020
401e6ae
changelog
yuxuanzhuang Jun 15, 2020
9225c71
sphinx more
yuxuanzhuang Jun 15, 2020
46c43af
add context manager approach text
yuxuanzhuang Jun 15, 2020
21fe5aa
add match for test valueerror
yuxuanzhuang Jun 15, 2020
821c822
typo
yuxuanzhuang Jun 15, 2020
5e25380
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 15, 2020
2541a3e
tell error and fileio cov
yuxuanzhuang Jun 19, 2020
1003cd3
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jun 19, 2020
1b7a798
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 19, 2020
b79d282
remove future import
yuxuanzhuang Jun 19, 2020
d909d63
merge to develop
yuxuanzhuang Jun 19, 2020
cafc596
sphinx block code
yuxuanzhuang Jun 19, 2020
2db1ef2
typo
yuxuanzhuang Jun 20, 2020
33ef68a
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 21, 2020
24f2a34
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 22, 2020
84baca9
pickle open pdb and xyz
yuxuanzhuang Jun 23, 2020
108ebde
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jun 23, 2020
7cb40ad
add pickle support to universe, add test, add chainreader
yuxuanzhuang Jun 24, 2020
352ab96
fix misc issues
yuxuanzhuang Jun 24, 2020
356986f
remove python2 legacy bz2
yuxuanzhuang Jun 24, 2020
e5ef732
remove fail test for offset
yuxuanzhuang Jun 24, 2020
aa6e40d
issue raised in changelog
yuxuanzhuang Jun 24, 2020
43a62d5
pep8
yuxuanzhuang Jun 24, 2020
2559625
add pickle func to ReaderBase and set offset
yuxuanzhuang Jun 26, 2020
507f8f5
add test for bz2 gzip and class check
yuxuanzhuang Jun 26, 2020
26fcfe9
add test for gsd, ncdf
yuxuanzhuang Jun 26, 2020
405a6dc
add test for trajectory.next after pickling
yuxuanzhuang Jun 26, 2020
2380a47
older gsd file
yuxuanzhuang Jun 26, 2020
dab38c1
move gsd, ncdf to coord
yuxuanzhuang Jun 29, 2020
5c07901
add chainreader state
yuxuanzhuang Jun 29, 2020
b324791
test timestep
yuxuanzhuang Jun 29, 2020
49f959d
doc
yuxuanzhuang Jun 29, 2020
773524d
add doc version change
yuxuanzhuang Jun 30, 2020
b7e4ef0
chainreader fix
yuxuanzhuang Jun 30, 2020
9d376b7
docstring error
yuxuanzhuang Jun 30, 2020
11cceb4
check dt before pickle
yuxuanzhuang Jul 1, 2020
a3130f5
add pickle test to base
yuxuanzhuang Jul 3, 2020
faf1e01
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 3, 2020
df7eb86
add chemfiles pickle
yuxuanzhuang Jul 4, 2020
72ba276
doc
yuxuanzhuang Jul 4, 2020
aa62ff0
doc add note
yuxuanzhuang Jul 5, 2020
04be63d
merge to develop
yuxuanzhuang Jul 5, 2020
f01769f
merge to develop
yuxuanzhuang Jul 5, 2020
5a2b28d
change chain getstate
yuxuanzhuang Jul 5, 2020
b5f5270
add in-line comments
yuxuanzhuang Jul 5, 2020
e1facfb
pep8
yuxuanzhuang Jul 5, 2020
cba4456
add chemfile test
yuxuanzhuang Jul 6, 2020
5622b51
pep8
yuxuanzhuang Jul 6, 2020
46cda48
raise error with mode
yuxuanzhuang Jul 7, 2020
5e2ee79
change to read_step
yuxuanzhuang Jul 8, 2020
b23b2fb
change to almost_equal
yuxuanzhuang Jul 8, 2020
cd03058
save frame
yuxuanzhuang Jul 8, 2020
3ce8ba7
save frame pep
yuxuanzhuang Jul 8, 2020
a5da2f7
add doc for pickle
yuxuanzhuang Jul 8, 2020
5a9ad4d
timestep pickle doc
yuxuanzhuang Jul 8, 2020
bc60aa7
doc serialize
yuxuanzhuang Jul 9, 2020
01fc644
doc sphinx
yuxuanzhuang Jul 10, 2020
84eb61f
pickle u with getsetstate
yuxuanzhuang Jul 10, 2020
9f18ccd
pep
yuxuanzhuang Jul 10, 2020
8d07004
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 10, 2020
e37c84a
warning on cfg
yuxuanzhuang Jul 10, 2020
2d3de99
sep files
yuxuanzhuang Jul 13, 2020
67b65d1
merge to develop
yuxuanzhuang Jul 13, 2020
18d146b
sep to two files
yuxuanzhuang Jul 13, 2020
8679e50
fixed failed merge in CHANGELOG
orbeckst Jul 14, 2020
0ceffe5
removed superfluous blank lines from CHANGELOG
orbeckst Jul 14, 2020
688041c
xdr dcd seek error
yuxuanzhuang Jul 16, 2020
204545b
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jul 16, 2020
3c71f8a
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jul 16, 2020
f2239bb
current frame xdr/dcd
yuxuanzhuang Jul 16, 2020
78c93a0
Merge branch 'develop' into serialize_io
orbeckst Jul 17, 2020
c0d241e
remove tests not needed
yuxuanzhuang Jul 19, 2020
68b1c2a
pep
yuxuanzhuang Jul 19, 2020
4061434
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 20, 2020
d457491
test title more accurate
yuxuanzhuang Jul 20, 2020
4c70dcb
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jul 27, 2020
0496ca1
misc
yuxuanzhuang Jul 27, 2020
df061fc
gsd dim
yuxuanzhuang Jul 28, 2020
abe92da
add test for runtimee pickle
yuxuanzhuang Jul 29, 2020
b3469fe
add test for runtimee pickle
yuxuanzhuang Jul 29, 2020
b12eb0d
pep
yuxuanzhuang Jul 29, 2020
fae4797
doc pickle_reader
yuxuanzhuang Jul 29, 2020
52a981e
mock chemfiles
yuxuanzhuang Jul 30, 2020
c4ec287
chemfiles mock when not found
yuxuanzhuang Aug 1, 2020
8804e5b
doc revised
yuxuanzhuang Aug 3, 2020
c99867f
add pickle test to single_framereader
yuxuanzhuang Aug 3, 2020
a70bc8b
add pickle test to fhiams
yuxuanzhuang Aug 3, 2020
bc487a5
test doc
yuxuanzhuang Aug 6, 2020
a1bb47e
test doc title
yuxuanzhuang Aug 6, 2020
5ace1e0
test doc title 2
yuxuanzhuang Aug 6, 2020
4 changes: 4 additions & 0 deletions package/CHANGELOG
@@ -56,6 +56,9 @@ Enhancements
* Added converter between Cartesian and Bond-Angle-Torsion coordinates (PR #2668)
* Added Hydrogen Bond Lifetime via existing autocorrelation features (PR #2791)
* Added Hydrogen Bond Lifetime keyword "between" (PR #2791)
* Added lib.picklable_file_io module for pickling file handlers. (PR #2723)
* Added pickling support to `Universe` and all Readers (without transformations)
(PR #2723)
* Dead code removed from the TPR parser and increased test coverage (PR #2840)
* TPR parser exposes the elements topology attribute (PR #2858, see Issue #2553)

@@ -77,6 +80,7 @@ Changes
* Removes deprecated ProgressMeter (Issue #2739)
* Removes deprecated MDAnalysis.units.N_Avogadro (PR #2737)
* Dropped Python 2 support
* Set Python 3.6 as the minimum supported version (Issue #2541)
* Changes the minimal NumPy version to 1.16.0 (Issue #2827, PR #2831)
* Sets the minimal RDKit version for CI to 2020.03.1 (Issue #2827, PR #2831)
* Removes deprecated waterdynamics.HydrogenBondLifetimes (PR #2842)
3 changes: 2 additions & 1 deletion package/MDAnalysis/coordinates/DLPoly.py
@@ -32,6 +32,7 @@

from . import base
from . import core
from ..lib import util

_DLPOLY_UNITS = {'length': 'Angstrom', 'velocity': 'Angstrom/ps', 'time': 'ps'}

@@ -149,7 +150,7 @@ def __init__(self, filename, **kwargs):
super(HistoryReader, self).__init__(filename, **kwargs)

# "private" file handle
self._file = open(self.filename, 'r')
self._file = util.anyopen(self.filename, 'r')
self.title = self._file.readline().strip()
self._levcfg, self._imcon, self.n_atoms = np.int64(self._file.readline().split()[:3])
self._has_vels = True if self._levcfg > 0 else False
154 changes: 145 additions & 9 deletions package/MDAnalysis/coordinates/GSD.py
@@ -44,13 +44,20 @@
.. autoclass:: GSDReader
:inherited-members:

.. autoclass:: GSDPicklable
:members:

.. autofunction:: gsd_pickle_open

"""
import numpy as np
import os
import gsd
import gsd.fl
import gsd.hoomd

from . import base


class GSDReader(base.ReaderBase):
"""Reader for the GSD format.

@@ -69,6 +76,10 @@ def __init__(self, filename, **kwargs):


.. versionadded:: 0.17.0
.. versionchanged:: 2.0.0
Now uses :class:`GSDPicklable`, a picklable subclass of
:class:`gsd.hoomd.HOOMDTrajectory`.

"""
super(GSDReader, self).__init__(filename, **kwargs)
self.filename = filename
@@ -77,10 +88,10 @@ def __init__(self, filename, **kwargs):
self.ts = self._Timestep(self.n_atoms, **self._ts_kwargs)
self._read_next_timestep()

def open_trajectory(self) :
def open_trajectory(self):
"""opens the trajectory file using gsd.hoomd module"""
self._frame = -1
self._file = gsd.hoomd.open(self.filename,mode='rb')
self._file = gsd_pickle_open(self.filename, mode='rb')

def close(self):
"""close reader"""
@@ -97,7 +108,7 @@ def _reopen(self):
self.open_trajectory()

def _read_frame(self, frame):
try :
try:
myframe = self._file[frame]
except IndexError:
raise IOError from None
@@ -111,20 +122,145 @@ def _read_frame(self, frame):

# set frame box dimensions
self.ts.dimensions = myframe.configuration.box
for i in range(3,6) :
self.ts.dimensions[i] = np.arccos(self.ts.dimensions[i]) * 180.0 / np.pi
self.ts.dimensions[3:] = np.rad2deg(np.arccos(self.ts.dimensions[3:]))

# set particle positions
frame_positions = myframe.particles.position
n_atoms_now = frame_positions.shape[0]
if n_atoms_now != self.n_atoms :
if n_atoms_now != self.n_atoms:
raise ValueError("Frame %d has %d atoms but the initial frame has %d"
" atoms. MDAnalysis is unable to deal with variable"
" topology!"%(frame, n_atoms_now, self.n_atoms))
else :
else:
self.ts.positions = frame_positions
return self.ts

def _read_next_timestep(self) :
def _read_next_timestep(self):
"""read next frame in trajectory"""
return self._read_frame(self._frame + 1)


class GSDPicklable(gsd.hoomd.HOOMDTrajectory):
"""Hoomd GSD file object (read-only) that can be pickled.

This class provides a file-like object (as returned by
:func:`gsd.hoomd.open`, namely :class:`gsd.hoomd.HOOMDTrajectory`) that,
unlike standard Python file objects, can be pickled. Only read mode is
supported.

When the object is pickled, the filename and mode of its underlying
:class:`gsd.fl.GSDFile` are saved. On unpickling, the file is reopened by
filename. This means that for a successful unpickle, the original file still
has to be accessible under its filename.

Note
----
Open hoomd GSD files with :func:`gsd_pickle_open`.
After pickling, the current frame is reset. Use `universe.trajectory[i]`
to return to the original frame.

Parameters
----------
file: :class:`gsd.fl.GSDFile`
File to access.

Example
-------
::

gsdfileobj = gsd.fl.open(name=filename,
mode='rb',
application='gsd.hoomd '+gsd.__version__,
schema='hoomd',
schema_version=[1, 3])
file = GSDPicklable(gsdfileobj)
file_pickled = pickle.loads(pickle.dumps(file))

See Also
---------
:class:`MDAnalysis.lib.picklable_file_io.FileIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.BufferIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.TextIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.GzipPicklable`
:class:`MDAnalysis.lib.picklable_file_io.BZ2Picklable`


.. versionadded:: 2.0.0
"""
def __getstate__(self):
return self.file.name, self.file.mode

def __setstate__(self, args):
gsd_version = gsd.__version__
schema_version = [1, 4] if gsd_version >= '1.9.0' else [1, 3]
gsdfileobj = gsd.fl.open(name=args[0],
mode=args[1],
application='gsd.hoomd ' + gsd_version,
schema='hoomd',
schema_version=schema_version)
self.__init__(gsdfileobj)
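
The `__getstate__`/`__setstate__` pair above follows a general pattern: store only what is needed to reopen the file, then reopen it on unpickling. Below is a minimal standard-library sketch of that pattern. `ReopeningReader` is a hypothetical stand-in written for illustration and is not part of MDAnalysis or gsd; it wraps a plain text file rather than a GSD file.

```python
import os
import pickle
import tempfile


class ReopeningReader:
    """Hypothetical read-only file wrapper that survives pickling."""

    def __init__(self, filename, mode="r"):
        if mode not in {"r", "rb"}:
            raise ValueError("Only read mode ('r', 'rb') files can be pickled.")
        self.filename = filename
        self.mode = mode
        # the open handle itself cannot be pickled
        self._file = open(filename, mode)

    def readline(self):
        return self._file.readline()

    def close(self):
        self._file.close()

    def __getstate__(self):
        # store only what is needed to reopen the file
        return self.filename, self.mode

    def __setstate__(self, state):
        # reopen by filename; reading starts again at the beginning
        self.__init__(*state)


# write a small throwaway file to read back
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("frame 0\nframe 1\n")

reader = ReopeningReader(path)
first_line = reader.readline()  # move the original past the first line

clone = pickle.loads(pickle.dumps(reader))
clone_line = clone.readline()   # the clone reads the first line again

reader.close()
clone.close()
os.remove(path)
```

As with `GSDPicklable`, the unpickled copy does not remember the read position, and unpickling fails if the file is no longer reachable under the stored filename.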


def gsd_pickle_open(name, mode='rb'):
"""Open a hoomd schema GSD file and return a picklable file object.

This function returns a :class:`GSDPicklable` object. It can be used as a
context manager and replaces :func:`gsd.hoomd.open` in read mode, which
returns a file object that cannot be pickled.

The schema version depends on the installed version of the gsd module.

Note
----
Can only be used in read mode.

Parameters
----------
name : str
a filename given as a text or byte string.
mode : {'r', 'rb'}, optional
open for reading; 'rb' is the default.

Returns
-------
stream-like object: GSDPicklable

Raises
------
ValueError
if `mode` is not one of the allowed read modes

Examples
--------
open as context manager::

with gsd_pickle_open('filename') as f:
frame = f[0]

open as function::

f = gsd_pickle_open('filename')
frame = f[0]
f.close()

See Also
--------
:func:`MDAnalysis.lib.util.anyopen`
:func:`MDAnalysis.lib.picklable_file_io.pickle_open`
:func:`MDAnalysis.lib.picklable_file_io.bz2_pickle_open`
:func:`MDAnalysis.lib.picklable_file_io.gzip_pickle_open`
:func:`gsd.hoomd.open`


.. versionadded:: 2.0.0
"""
gsd_version = gsd.__version__
schema_version = [1, 4] if gsd_version >= '1.9.0' else [1, 3]
Member

Maybe something for me to do in a follow-up, but we can avoid some headaches long-term by using the PEP440 implementation of version comparisons here and elsewhere.

Those are robust to alpha/beta/rc/dev versions and so on. The raw string comparison is much less robust; there's a sort of middle ground available from distutils for version checking as well, though the PEP one is better these days.

Probably out of scope given the size of this PR already though.
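
To illustrate the point of this comment: comparing version strings character by character ranks `'1.10.0'` below `'1.9.0'`. The sketch below shows one way the check could be written. It assumes the third-party `packaging` library (a PEP 440 implementation) may be installed and falls back to a crude numeric comparison otherwise. The helper names `release_tuple`, `version_at_least` and `hoomd_schema_version` are hypothetical and not part of this PR.

```python
import re

try:
    # PEP 440 implementation; third-party, so it may be missing
    from packaging.version import Version
except ImportError:
    Version = None


def release_tuple(version):
    """Crude fallback: keep only the leading numeric components."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(installed, minimum):
    """Return True if `installed` is at least `minimum`, compared numerically."""
    if Version is not None:
        return Version(installed) >= Version(minimum)
    # the fallback ignores pre-release tags such as 'rc1'
    return release_tuple(installed) >= release_tuple(minimum)


def hoomd_schema_version(gsd_version):
    # same threshold as in the diff: schema 1.4 from gsd 1.9.0 onwards
    return [1, 4] if version_at_least(gsd_version, "1.9.0") else [1, 3]


# raw string comparison gets a two-digit minor version wrong
string_check = "1.10.0" >= "1.9.0"          # False
numeric_check = version_at_least("1.10.0", "1.9.0")  # True
```

With `packaging` available, pre-release versions such as `1.9.0rc1` sort before `1.9.0`, which the fallback does not handle.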

if mode not in {'r', 'rb'}:
raise ValueError("Only read mode ('r', 'rb') "
"files can be pickled.")
gsdfileobj = gsd.fl.open(name=name,
mode=mode,
application='gsd.hoomd ' + gsd_version,
schema='hoomd',
schema_version=schema_version)
return GSDPicklable(gsdfileobj)
63 changes: 60 additions & 3 deletions package/MDAnalysis/coordinates/TRJ.py
@@ -85,6 +85,8 @@
.. autoclass:: NCDFWriter
:members:

.. autoclass:: NCDFPicklable
:members:

.. _ascii-trajectories:

@@ -158,7 +160,6 @@
import MDAnalysis
from . import base
from ..lib import util

logger = logging.getLogger("MDAnalysis.coordinates.AMBER")


@@ -450,6 +451,9 @@ class NCDFReader(base.ReaderBase):
.. versionchanged:: 1.0.0
Support for reading `degrees` units for `cell_angles` has now been
removed (Issue #2327)
.. versionchanged:: 2.0.0
Now uses :class:`NCDFPicklable`, a picklable subclass of
:class:`scipy.io.netcdf.netcdf_file`.

"""

@@ -469,8 +473,8 @@ def __init__(self, filename, n_atoms=None, mmap=None, **kwargs):

super(NCDFReader, self).__init__(filename, **kwargs)

self.trjfile = scipy.io.netcdf.netcdf_file(self.filename,
mmap=self._mmap)
self.trjfile = NCDFPicklable(self.filename,
mmap=self._mmap)

# AMBER NetCDF files should always have a convention
try:
@@ -1075,3 +1079,56 @@ def close(self):
if self.trjfile is not None:
self.trjfile.close()
self.trjfile = None


class NCDFPicklable(scipy.io.netcdf.netcdf_file):
"""NetCDF file object (read-only) that can be pickled.

This class provides a file-like object (as returned by
:class:`scipy.io.netcdf.netcdf_file`) that,
unlike standard Python file objects,
can be pickled. Only read mode is supported.

When the object is pickled, the filename and mmap setting of the open file
are saved. On unpickling, the file is reopened by filename
and memory-mapped again if mmap was in use.
This means that for a successful unpickle, the original file still has to
be accessible under its filename.

Parameters
----------
filename : str or file-like
a filename given as a text or byte string.
mmap : None or bool, optional
Whether to mmap `filename` when reading. True when `filename`
is a file name, False when `filename` is a file-like object.

Example
-------
::

f = NCDFPicklable(NCDF)
print(f.variables['coordinates'].data)
f.close()

can also be used as context manager::

with NCDFPicklable(NCDF) as f:
print(f.variables['coordinates'].data)

See Also
---------
:class:`MDAnalysis.lib.picklable_file_io.FileIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.BufferIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.TextIOPicklable`
:class:`MDAnalysis.lib.picklable_file_io.GzipPicklable`
:class:`MDAnalysis.lib.picklable_file_io.BZ2Picklable`


.. versionadded:: 2.0.0
"""
def __getstate__(self):
return self.filename, self.use_mmap

def __setstate__(self, args):
self.__init__(args[0], mmap=args[1])