Serialize FileIO and TextIOWrapper and Universe #2723

Merged on Aug 8, 2020 (120 commits)
Changes from 3 commits

Commits
931d9b5
Merge pull request #1 from MDAnalysis/develop
yuxuanzhuang May 20, 2020
d69ce98
Merge remote-tracking branch 'mda_origin/develop' into develop
yuxuanzhuang Jun 9, 2020
432edf3
add pickle function to fileio, textio
yuxuanzhuang Jun 9, 2020
bf55bf3
add basic test for pickle io
yuxuanzhuang Jun 9, 2020
cd9a485
add comments
yuxuanzhuang Jun 9, 2020
f33629f
xfail on python2
yuxuanzhuang Jun 10, 2020
aca4496
add doc and exception for pickle_open
yuxuanzhuang Jun 11, 2020
cd4ffe3
add doc for textio fileio class
yuxuanzhuang Jun 11, 2020
ec5bd3c
add parallel test for textio
yuxuanzhuang Jun 11, 2020
f6515ee
pep8
yuxuanzhuang Jun 11, 2020
cf29764
add an extra bufferlayer for FileIO for fast access
yuxuanzhuang Jun 11, 2020
6b09e20
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jun 11, 2020
c29c316
ditch py2
yuxuanzhuang Jun 11, 2020
db47e27
pep8 and doc
yuxuanzhuang Jun 11, 2020
7e9d6d3
add test for unsupported mode
yuxuanzhuang Jun 11, 2020
eb83a7c
pep8
yuxuanzhuang Jun 11, 2020
0baa868
typo
yuxuanzhuang Jun 12, 2020
c8e63b2
pickle reorder
yuxuanzhuang Jun 12, 2020
658b446
pickle_open as context manager
yuxuanzhuang Jun 12, 2020
1aa6003
format
yuxuanzhuang Jun 12, 2020
f2738bc
move pickle-io to a separate file
yuxuanzhuang Jun 12, 2020
001f3b8
doc
yuxuanzhuang Jun 12, 2020
d0374f5
FileIOPicklable class now only supports name as input, (preventing us…
yuxuanzhuang Jun 12, 2020
8c62df8
pickle open doc and add fspath for the filename
yuxuanzhuang Jun 12, 2020
94f1f8d
absolute import
yuxuanzhuang Jun 12, 2020
acbadec
more doc
yuxuanzhuang Jun 13, 2020
7043d2d
more pep8 and format
yuxuanzhuang Jun 15, 2020
546b05d
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jun 15, 2020
c259143
sphinx mark up
yuxuanzhuang Jun 15, 2020
a016a65
add pickle_open example
yuxuanzhuang Jun 15, 2020
401e6ae
changelog
yuxuanzhuang Jun 15, 2020
9225c71
sphinx more
yuxuanzhuang Jun 15, 2020
46c43af
add context manager approach text
yuxuanzhuang Jun 15, 2020
21fe5aa
add match for test valueerror
yuxuanzhuang Jun 15, 2020
821c822
typo
yuxuanzhuang Jun 15, 2020
5e25380
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 15, 2020
2541a3e
tell error and fileio cov
yuxuanzhuang Jun 19, 2020
1003cd3
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jun 19, 2020
1b7a798
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 19, 2020
b79d282
remove future import
yuxuanzhuang Jun 19, 2020
d909d63
merge to develop
yuxuanzhuang Jun 19, 2020
cafc596
sphinx block code
yuxuanzhuang Jun 19, 2020
2db1ef2
typo
yuxuanzhuang Jun 20, 2020
33ef68a
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 21, 2020
24f2a34
Merge branch 'develop' into serialize_io
yuxuanzhuang Jun 22, 2020
84baca9
pickle open pdb and xyz
yuxuanzhuang Jun 23, 2020
108ebde
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jun 23, 2020
7cb40ad
add pickle support to universe, add test, add chainreader
yuxuanzhuang Jun 24, 2020
352ab96
fix misc issues
yuxuanzhuang Jun 24, 2020
356986f
remove python2 legacy bz2
yuxuanzhuang Jun 24, 2020
e5ef732
remove fail test for offset
yuxuanzhuang Jun 24, 2020
aa6e40d
issue raised in changelog
yuxuanzhuang Jun 24, 2020
43a62d5
pep8
yuxuanzhuang Jun 24, 2020
2559625
add pickle func to ReaderBase and set offset
yuxuanzhuang Jun 26, 2020
507f8f5
add test for bz2 gzip and class check
yuxuanzhuang Jun 26, 2020
26fcfe9
add test for gsd, ncdf
yuxuanzhuang Jun 26, 2020
405a6dc
add test for trajectory.next after pickling
yuxuanzhuang Jun 26, 2020
2380a47
older gsd file
yuxuanzhuang Jun 26, 2020
dab38c1
move gsd, ncdf to coord
yuxuanzhuang Jun 29, 2020
5c07901
add chainreader state
yuxuanzhuang Jun 29, 2020
b324791
test timestep
yuxuanzhuang Jun 29, 2020
49f959d
doc
yuxuanzhuang Jun 29, 2020
773524d
add doc version change
yuxuanzhuang Jun 30, 2020
b7e4ef0
chainreader fix
yuxuanzhuang Jun 30, 2020
9d376b7
docstring error
yuxuanzhuang Jun 30, 2020
11cceb4
check dt before pickle
yuxuanzhuang Jul 1, 2020
a3130f5
add pickle test to base
yuxuanzhuang Jul 3, 2020
faf1e01
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 3, 2020
df7eb86
add chemfiles pickle
yuxuanzhuang Jul 4, 2020
72ba276
doc
yuxuanzhuang Jul 4, 2020
aa62ff0
doc add note
yuxuanzhuang Jul 5, 2020
04be63d
merge to develop
yuxuanzhuang Jul 5, 2020
f01769f
merge to develop
yuxuanzhuang Jul 5, 2020
5a2b28d
change chain getstate
yuxuanzhuang Jul 5, 2020
b5f5270
add in-line comments
yuxuanzhuang Jul 5, 2020
e1facfb
pep8
yuxuanzhuang Jul 5, 2020
cba4456
add chemfile test
yuxuanzhuang Jul 6, 2020
5622b51
pep8
yuxuanzhuang Jul 6, 2020
46cda48
raise error with mode
yuxuanzhuang Jul 7, 2020
5e2ee79
change to read_step
yuxuanzhuang Jul 8, 2020
b23b2fb
change to almost_equal
yuxuanzhuang Jul 8, 2020
cd03058
save frame
yuxuanzhuang Jul 8, 2020
3ce8ba7
save frame pep
yuxuanzhuang Jul 8, 2020
a5da2f7
add doc for pickle
yuxuanzhuang Jul 8, 2020
5a9ad4d
timestep pickle doc
yuxuanzhuang Jul 8, 2020
bc60aa7
doc serialize
yuxuanzhuang Jul 9, 2020
01fc644
doc sphinx
yuxuanzhuang Jul 10, 2020
84eb61f
pickle u with getsetstate
yuxuanzhuang Jul 10, 2020
9f18ccd
pep
yuxuanzhuang Jul 10, 2020
8d07004
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 10, 2020
e37c84a
warning on cfg
yuxuanzhuang Jul 10, 2020
2d3de99
sep files
yuxuanzhuang Jul 13, 2020
67b65d1
merge to develop
yuxuanzhuang Jul 13, 2020
18d146b
sep to two files
yuxuanzhuang Jul 13, 2020
8679e50
fixed failed merge in CHANGELOG
orbeckst Jul 14, 2020
0ceffe5
removed superfluous blank lines from CHANGELOG
orbeckst Jul 14, 2020
688041c
xdr dcd seek error
yuxuanzhuang Jul 16, 2020
204545b
Merge branch 'serialize_io' of https://github.com/yuxuanzhuang/mdanal…
yuxuanzhuang Jul 16, 2020
3c71f8a
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jul 16, 2020
f2239bb
current frame xdr/dcd
yuxuanzhuang Jul 16, 2020
78c93a0
Merge branch 'develop' into serialize_io
orbeckst Jul 17, 2020
c0d241e
remove tests not needed
yuxuanzhuang Jul 19, 2020
68b1c2a
pep
yuxuanzhuang Jul 19, 2020
4061434
Merge branch 'develop' into serialize_io
yuxuanzhuang Jul 20, 2020
d457491
test title more accurate
yuxuanzhuang Jul 20, 2020
4c70dcb
Merge remote-tracking branch 'mda_origin/develop' into serialize_io
yuxuanzhuang Jul 27, 2020
0496ca1
misc
yuxuanzhuang Jul 27, 2020
df061fc
gsd dim
yuxuanzhuang Jul 28, 2020
abe92da
add test for runtimee pickle
yuxuanzhuang Jul 29, 2020
b3469fe
add test for runtimee pickle
yuxuanzhuang Jul 29, 2020
b12eb0d
pep
yuxuanzhuang Jul 29, 2020
fae4797
doc pickle_reader
yuxuanzhuang Jul 29, 2020
52a981e
mock chemfiles
yuxuanzhuang Jul 30, 2020
c4ec287
chemfiles mock when not found
yuxuanzhuang Aug 1, 2020
8804e5b
doc revised
yuxuanzhuang Aug 3, 2020
c99867f
add pickle test to single_framereader
yuxuanzhuang Aug 3, 2020
a70bc8b
add pickle test to fhiams
yuxuanzhuang Aug 3, 2020
bc487a5
test doc
yuxuanzhuang Aug 6, 2020
a1bb47e
test doc title
yuxuanzhuang Aug 6, 2020
5ace1e0
test doc title 2
yuxuanzhuang Aug 6, 2020
4 changes: 2 additions & 2 deletions package/MDAnalysis/core/universe.py
Expand Up @@ -318,8 +318,8 @@ class Universe(object):

.. versionchanged:: 2.0.0
Universe now can be (un)pickled.
Topology, trajectory and anchor_name are reserved upon unpickle.

``topology``, ``trajectory`` and ``anchor_name`` are preserved
upon unpickling.
"""
# Py3 TODO
# def __init__(self, topology=None, *coordinates, all_coordinates=False,
Expand Down
67 changes: 43 additions & 24 deletions package/MDAnalysis/lib/picklable_file_io.py
Expand Up @@ -80,9 +80,12 @@ class FileIOPicklable(io.FileIO):
Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode : str
only reading ('r') mode works.
only reading ('r') mode works. It exists to be consistent
with a wider API.

Example
-------
Expand Down Expand Up @@ -227,12 +230,14 @@ class BZ2Picklable(bz2.BZ2File):
Note
----
This class only supports reading files in binary mode. If you need to open
to open a compressed file in text mode, use the :func:`bz2_pickle_open`.
a compressed file in text mode, use :func:`bz2_pickle_open`.

Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode : str
can only be 'r', 'rb' to make pickle work.

Expand Down Expand Up @@ -292,7 +297,9 @@ class GzipPicklable(gzip.GzipFile):
Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode : str
can only be 'r', 'rb' to make pickle work.

Expand Down Expand Up @@ -334,14 +341,16 @@ def __setstate__(self, args):
def pickle_open(name, mode='rt'):
Member commented:

I do not particularly like this function. It is unclear from the name that this can only be used to open files for reading. You also implicitly assume we always want to pickle a file object when we open it for reading. I would much rather have an exception when we actually try to pickle the file.

Member commented:

So if we get IO classes that behave like the original ones except that they raise an error when you try to pickle with “w” but pickle fine with “r” then we still need our own open() function, right? Could we directly have anyopen() use these new classes? (Maybe you said this already somewhere...)

Member commented:

Not sure. But yeah only using anyopen uniformly in MDAnalysis sounds good.

Member commented:

We could mark the pickle_open() function and related functions as private for now, i.e., _pickle_open(), _bz2_pickle_open(), _gsd_pickle_open() etc. We then have the freedom to change implementation in a backward-incompatible fashion even after a 2.0 release, e.g., if we want to go back to the idea to having classes that accept r and w and only fail pickling at runtime.

Or are you ultimately ok with the design as it stands, @kain88-de ?
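
For reference, the alternative raised in this thread (classes that accept both read and write modes and only fail when pickling is attempted) could look roughly like the sketch below. `ModeCheckedFileIO` is a hypothetical name, not a class from this PR. The sketch assumes the pickled state is just the file name plus the current position, and it overrides `__reduce_ex__` directly so that it does not depend on how the base `io` class handles pickling.

```python
import io
import os
import pickle
import tempfile


class ModeCheckedFileIO(io.FileIO):
    """Hypothetical class: opens in any mode, pickles only in read mode."""

    def __init__(self, name, mode="r"):
        self._mode = mode
        super().__init__(name, mode)

    def __reduce_ex__(self, protocol):
        # The mode is checked here, at pickling time, not when opening.
        if self._mode != "r":
            raise RuntimeError(
                f"cannot pickle a file opened with mode {self._mode!r}")
        # Unpickling calls ModeCheckedFileIO(name), then __setstate__(position).
        return self.__class__, (self.name,), self.tell()

    def __setstate__(self, position):
        self.seek(position)  # restore the read position


# Usage: writing works as usual, but pickling the writer is refused.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with ModeCheckedFileIO(path, "w") as writer:
    writer.write(b"0123456789")
    try:
        pickle.dumps(writer)
    except RuntimeError as err:
        print("refused:", err)

with ModeCheckedFileIO(path) as reader:
    reader.read(4)
    with pickle.loads(pickle.dumps(reader)) as clone:
        print(clone.read(3))  # b'456', continuing from offset 4
```

With this shape the open function no longer needs to reject write modes up front; the cost is that the error surfaces later, at the point where the object is serialized.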

"""Open file and return a stream with pickle function implemented.

This function returns either BufferIOPicklable or TextIOPicklable wrapped
FileIOPicklable object given different reading mode. It can be used as a
context manager, and replace the built-in :func:`open` function
in read mode that only returns an unpicklable file object.
This function returns a FileIOPicklable object wrapped in a
BufferIOPicklable class when given the "rb" reading mode,
or a FileIOPicklable object wrapped in a TextIOPicklable class with the "r"
or "rt" reading mode. It can be used as a context manager, and replace the
built-in :func:`open` function in read mode that only returns an
unpicklable file object.
In order to serialize a :class:`MDAnalysis.core.Universe`, this function
can used to open trajectory/topology files--an object composition approach,
as opposed to class inheritance, which is more flexible and easier for
pickle implementation for new readers.
can be used to open trajectory/topology files. This object composition is more
flexible and easier than class inheritance to implement pickling
for new readers.

Note
----
Expand All @@ -350,7 +359,9 @@ def pickle_open(name, mode='rt'):
Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode: {'r', 'rt', 'rb'} (optional)
'r': open for reading in text mode;
'rt': read in text mode (default);
Expand Down Expand Up @@ -390,7 +401,7 @@ def pickle_open(name, mode='rt'):
"""
if mode not in {'r', 'rt', 'rb'}:
Member commented:

Actually, a different idea here could be to fall back to the standard open function when a write mode is requested.
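
A minimal sketch of that fallback idea, under stated assumptions: `PicklableRawReader` and `open_with_fallback` are hypothetical names, and to keep the example short only the `'rb'` mode returns a picklable handle, while every other mode is passed to the built-in `open`.

```python
import io
import os
import pickle
import tempfile


class PicklableRawReader(io.FileIO):
    """Hypothetical stand-in for the picklable read-only classes in this PR."""

    def __init__(self, name):
        super().__init__(name, "r")

    def __reduce_ex__(self, protocol):
        # Unpickling calls PicklableRawReader(name), then __setstate__(position).
        return self.__class__, (self.name,), self.tell()

    def __setstate__(self, position):
        self.seek(position)


def open_with_fallback(name, mode="rb"):
    """Return a picklable handle for 'rb'; defer to open() for other modes."""
    if mode == "rb":
        return PicklableRawReader(os.fspath(name))
    # Everything else gets an ordinary, unpicklable file object.
    return open(name, mode)


path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open_with_fallback(path, "wb") as out:  # plain buffered writer
    out.write(b"abcdef")

with open_with_fallback(path, "rb") as src:  # picklable reader
    src.read(2)
    clone = pickle.loads(pickle.dumps(src))
    print(clone.read())  # b'cdef'
    clone.close()
```

The trade-off is that callers no longer get a `ValueError` for write modes, so an attempt to pickle a writer fails only when the pickling happens.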

raise ValueError("Only read mode ('r', 'rt', 'rb') "
"iles can be pickled.")
"files can be pickled.")
name = os.fspath(name)
raw = FileIOPicklable(name)
if mode == 'rb':
Expand All @@ -403,10 +414,12 @@ def bz2_pickle_open(name, mode='rb'):
"""Open a bzip2-compressed file in binary or text mode
with pickle function implemented.

This function returns either BZ2Picklable or TextIOPicklable wrapped
BZ2Picklable object given different reading mode. It can be used as a
context manager, and replace the built-in :func:`bz2.open` function
in read mode that only returns an unpicklable file object.
This function returns a BZ2Picklable object when given the "rb" or "r"
reading mode, or a BZ2Picklable object wrapped in a TextIOPicklable class
with the "rt" reading mode.
It can be used as a context manager, and replace the built-in
:func:`bz2.open` function in read mode that only returns an
unpicklable file object.

Note
----
Expand All @@ -415,7 +428,9 @@ def bz2_pickle_open(name, mode='rb'):
Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode: {'r', 'rt', 'rb'} (optional)
'r': open for reading in binary mode;
'rt': read in text mode;
Expand Down Expand Up @@ -471,10 +486,12 @@ def gzip_pickle_open(name, mode='rb'):
"""Open a gzip-compressed file in binary or text mode
with pickle function implemented.

This function returns either GzipPicklable or TextIOPicklable wrapped
GzipPicklable object given different reading mode. It can be used as a
context manager, and replace the built-in :func:`gzip.open` function
in read mode that only returns an unpicklable file object.
This function returns a GzipPicklable object when given the "rb" or "r"
reading mode, or a GzipPicklable object wrapped in a TextIOPicklable class
with the "rt" reading mode.
It can be used as a context manager, and replace the built-in
:func:`gzip.open` function in read mode that only returns an
unpicklable file object.

Note
----
Expand All @@ -483,7 +500,9 @@ def gzip_pickle_open(name, mode='rb'):
Parameters
----------
name : str
a filename given a text or byte string.
either a text or byte string giving the name (and the path
if the file isn't in the current working directory) of the file to
be opened.
mode: {'r', 'rt', 'rb'} (optional)
'r': open for reading in binary mode;
'rt': read in text mode;
Expand Down
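
To make the composition described in these docstrings concrete, here is a simplified sketch of a raw picklable file wrapped by either a buffered or a text reader. It is not the code from this PR: `RawPicklable`, `BufferedPicklable`, `TextPicklable` and `pickle_open_sketch` are hypothetical names that loosely correspond to `FileIOPicklable`, `BufferIOPicklable`, `TextIOPicklable` and `pickle_open`, and the sketch assumes that a file name plus a position is enough state to restore a handle.

```python
import io
import os
import pickle
import tempfile


class RawPicklable(io.FileIO):
    """Raw read-only file, recreated from its name and position."""

    def __init__(self, name):
        super().__init__(name, "r")

    def __reduce_ex__(self, protocol):
        # Unpickling calls RawPicklable(name), then __setstate__(position).
        return self.__class__, (self.name,), self.tell()

    def __setstate__(self, position):
        self.seek(position)


class BufferedPicklable(io.BufferedReader):
    """Binary wrapper: pickles the wrapped raw file plus its own position."""

    def __reduce_ex__(self, protocol):
        return self.__class__, (self.raw,), self.tell()

    def __setstate__(self, position):
        self.seek(position)


class TextPicklable(io.TextIOWrapper):
    """Text wrapper: tell() yields an opaque cookie that seek() accepts."""

    def __reduce_ex__(self, protocol):
        return self.__class__, (self.buffer,), self.tell()

    def __setstate__(self, cookie):
        self.seek(cookie)


def pickle_open_sketch(name, mode="rt"):
    """Return a picklable reader: buffered for 'rb', text for 'r'/'rt'."""
    if mode not in {"r", "rt", "rb"}:
        raise ValueError("Only read mode ('r', 'rt', 'rb') "
                         "files can be pickled.")
    raw = RawPicklable(os.fspath(name))
    return BufferedPicklable(raw) if mode == "rb" else TextPicklable(raw)


path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", newline="\n") as out:
    out.write("first\nsecond\nthird\n")

with pickle_open_sketch(path) as handle:
    handle.readline()  # consume "first"
    with pickle.loads(pickle.dumps(handle)) as clone:
        print(clone.readline().strip())  # second
```

The unpickled handle opens its own file descriptor, so the original and the copy can be advanced independently, which is what a worker process needs when it receives a reader.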
Expand Up @@ -29,7 +29,7 @@ If the new reader uses :func:`util.anyopen()`
(e.g. :class:`MDAnalysis.coordinates.PDB.PDBReader`),
the reading handler can be pickled without modification.
If the new reader uses I/O classes from other package
(e.g. :class:`MDAnalysis.coordinates.GSD.GSDReader`)),
(e.g. :class:`MDAnalysis.coordinates.GSD.GSDReader`),
and cannot be pickled natively, create a new picklable class inherited from
the file class in that package
(e.g. :class:`MDAnalysis.coordinates.GSD.GSDPicklable`),
Expand Down
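
As an illustration of the "inherit from the package's file class" route described in this doc text, the sketch below uses `gzip.GzipFile` from the standard library as a stand-in for a third-party file class. `PicklableGzipReader` is a hypothetical name and is not the `GSDPicklable` or `GzipPicklable` implementation from this PR; it assumes the file can be reopened by path and repositioned with `seek`.

```python
import gzip
import os
import pickle
import tempfile


class PicklableGzipReader(gzip.GzipFile):
    """Hypothetical picklable subclass of a file class we do not control."""

    def __init__(self, name):
        # Remember the path ourselves instead of relying on base attributes.
        self._path = os.fspath(name)
        super().__init__(self._path, "rb")

    def __reduce_ex__(self, protocol):
        # Unpickling calls PicklableGzipReader(path), then __setstate__(position).
        return self.__class__, (self._path,), self.tell()

    def __setstate__(self, position):
        # Positions refer to the decompressed stream.
        self.seek(position)


path = os.path.join(tempfile.mkdtemp(), "demo.gz")
with gzip.open(path, "wb") as out:
    out.write(b"frame 0\nframe 1\nframe 2\n")

with PicklableGzipReader(path) as reader:
    reader.readline()  # consume "frame 0"
    with pickle.loads(pickle.dumps(reader)) as clone:
        print(clone.readline())  # b'frame 1\n'
```

A reader built on such a class can then hand its file handle to pickle without any further changes, at the price of one small subclass per external file type.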
7 changes: 7 additions & 0 deletions testsuite/MDAnalysisTests/coordinates/base.py
Expand Up @@ -120,6 +120,13 @@ def test_last_slice(self):
frames = [ts.frame for ts in trj_iter]
assert_equal(frames, np.arange(self.universe.trajectory.n_frames))

def test_pickle_singleframe_reader(self):
reader = self.universe.trajectory
reader_p = pickle.loads(pickle.dumps(reader))
assert_equal(len(reader), len(reader_p))
assert_equal(reader.ts, reader_p.ts,
"Single-frame timestep is changed after pickling")


class BaseReference(object):
def __init__(self):
Expand Down
2 changes: 2 additions & 0 deletions testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py
Expand Up @@ -36,6 +36,7 @@
DMS,
DLP_CONFIG,
DLP_HISTORY,
FHIAIMS,
INPCRD,
GMS_ASYMOPT,
GMS_SYMOPT,
Expand Down Expand Up @@ -130,6 +131,7 @@ def test_universe_unpickle_in_new_process():
('DCD', DCD, dict()),
('DMS', DMS, dict()),
('CONFIG', DLP_CONFIG, dict()),
('FHIAIMS', FHIAIMS, dict()),
('HISTORY', DLP_HISTORY, dict()),
('INPCRD', INPCRD, dict()),
('LAMMPSDUMP', LAMMPSDUMP, dict()),
Expand Down