Parallel trajectory analysis (revisited) #618
Conversation
…ent test analysis (electromagnetism.py), added parallel_jobs.py module, added test for the parallel_jobs.py module
…s has its own universe, eliminating the problem of accessing arrays that have already been deallocated. Adapted base.py, parallel_jobs.py and test_parallel_jobs.py according to PEP8 style.
@@ -96,10 +92,14 @@ def run(self):
        self._prepare()
        for i, ts in enumerate(
                self._trajectory[self.start:self.stop:self.step]):
            self._ts = ts
Is there a reason why this line has to go? Could we call `_single_frame` with `ts` as the argument? I.e.:

```python
for i, ts in enumerate(self._traj[etc]):
    self._single_frame(ts)
```
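Read literally, the suggestion turns the loop in the diff above into something like this (a sketch only, using the attribute names visible in the hunk):

```python
class AnalysisBase(object):
    def run(self):
        self._prepare()
        for i, ts in enumerate(
                self._trajectory[self.start:self.stop:self.step]):
            # hand the Timestep straight to the per-frame hook
            # instead of stashing it on self._ts first
            self._single_frame(ts)
        self._conclude()
```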
I agree, calling `_single_frame` with `ts` as the argument looks much cleaner.

This is looking good, and the efficiency is much higher than I was expecting.
Yes! If you or anyone else has some analysis class that you think is easy to adapt (by adding the …
…bar to track parallel computations.
…d with class ParallelProcessor has been corrected. Other minor fixes
Hi everybody! So, as @richardjgowers suggested, I modified the … Let me know what you think!

P.S.: I also added a basic progress bar in a `utils` folder within `package/MDAnalysis/analysis`; I hope that is fine.
@@ -0,0 +1,102 @@
""" Add docstring later
This should all go into `lib.log`: there's already `ProgressMeter`, and `ProgressBar` would nicely complement it.
Done, moved the code to `lib.log` and also modified the bar with threading so that it supports working with serial jobs. Hopefully that's what you meant.
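A minimal sketch of what such a thread-driven bar could look like (the internals here are illustrative, not the code that was merged):

```python
import sys
import threading
import time

class ProgressBar(object):
    """Render progress from a background thread, so even a serial job
    that never yields control still gets periodic screen updates."""

    def __init__(self, total, interval=0.5):
        self.total = total
        self.count = 0              # bumped by the analysis loop
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop)
        self._thread.daemon = True  # don't keep the interpreter alive

    def start(self):
        self._thread.start()
        return self

    def update(self, n=1):
        self.count += n

    def _loop(self):
        while not self._stop.is_set():
            frac = min(self.count / float(self.total), 1.0)
            bar = "#" * int(20 * frac)
            sys.stderr.write("\r[{:<20}] {:3.0f}%".format(bar, 100 * frac))
            sys.stderr.flush()
            time.sleep(self.interval)

    def finish(self):
        self._stop.set()
        self._thread.join()
        sys.stderr.write("\n")
```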
BTW, did any of you have a look at joblib for this?
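For reference, the joblib pattern in question would look roughly like this (the file names and the per-block body are placeholders):

```python
from joblib import Parallel, delayed

import MDAnalysis as mda

TOP, TRAJ = "topol.tpr", "traj.xtc"   # placeholder file names

def analyse_block(top, traj, start, stop):
    # each worker builds its own Universe, so no reader state is shared
    u = mda.Universe(top, traj)
    return [u.trajectory[i].frame for i in range(start, stop)]  # toy work

results = Parallel(n_jobs=2)(
    delayed(analyse_block)(TOP, TRAJ, start, stop)
    for start, stop in [(0, 147), (147, 294)])
```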
package/MDAnalysis/analysis/base.py
            self._conclude()

        else:
            if threads is None:
I'd prefer it if all this code were contained inside `_parallel_run`; the `run` function can then just end up as:

```python
def run(self, parallel=False, nthreads=None):
    if not parallel:
        self._serial_run()
    else:
        self._parallel_run(nthreads)
```

So `_serial_run` just contains the "original" run code, and `_parallel_run` has your new code.
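A sketch of the division of labour `_parallel_run` could then encapsulate, written here as standalone functions with multiprocessing; every identifier in this block is illustrative, not the PR's code:

```python
import multiprocessing

def split_frames(n_frames, n_workers):
    """Partition frame indices into contiguous (start, stop) blocks."""
    block, extra = divmod(n_frames, n_workers)
    bounds, start = [], 0
    for i in range(n_workers):
        stop = start + block + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def run_parallel(analyse_block, n_frames, n_workers):
    """analyse_block(start, stop) -> partial result for one slice."""
    pool = multiprocessing.Pool(n_workers)
    jobs = [pool.apply_async(analyse_block, bounds)
            for bounds in split_frames(n_frames, n_workers)]
    pool.close()
    pool.join()
    # collect in submission order so results stay in frame order
    return [job.get() for job in jobs]
```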
Makes sense, it's much cleaner this way. I modified the code accordingly.
…Moved the code to lib.log
…ppropriate serial or parallel method. Adapted AnalysisBase and ParallelProcessor to run with the new progressbar, which now also works for serial analyses
@kain88-de it seems a very interesting lib, thanks for pointing it out! It could probably solve the problem of allocating/deallocating arrays with positions/masses etc. when using a single …
Hey Matti,

This is looking good. I think the next step is to fix all the things we've broken: https://travis-ci.org/MDAnalysis/mdanalysis/jobs/103233684

I think these are because you changed the signature of …

You should be able to run those tests locally by going to …
…ich now takes a universe and not a trajectory, and requires method _single_frame to take a timestep as argument
Good day Richard,

I modified the modules that conflicted with the new signatures of …

There is one big issue, though. I noticed that e.g. the rdf analysis takes as input two selections …

Anyway, this issue can be easily solved by making sure that selections are made WITHIN the analysis object, solely in the …
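A sketch of that fix, assuming a hypothetical analysis class that stores the selection strings and only builds the AtomGroups inside the object (here in `_prepare`, which is an assumption about where "solely in the …" was pointing):

```python
from MDAnalysis.analysis.base import AnalysisBase

class MyAnalysis(AnalysisBase):
    """Hypothetical example class, not part of the PR."""
    def __init__(self, universe, sel1, sel2, **kwargs):
        super(MyAnalysis, self).__init__(universe, **kwargs)
        # keep the selection *strings*: they travel to workers trivially
        self._sel1, self._sel2 = sel1, sel2

    def _prepare(self):
        # build the AtomGroups here, on the worker's own Universe
        self.g1 = self.universe.select_atoms(self._sel1)
        self.g2 = self.universe.select_atoms(self._sel2)

    def _single_frame(self, ts):
        pass  # per-frame work on self.g1 / self.g2 goes here
```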
Yeah, I had the same thought too. I think a way around this is to store the AtomGroups within the class not as AtomGroups, but as indices (use `ag.indices`).

Would this work?
I see what you mean, but we still do not know a priori how many selections an analysis object is going to need, nor do we know which names they were given by the developer of a new analysis object; thus we wouldn't know how to "reference" them when we want to use the indices to slice a local version of …
Maybe, but if we stored them as a list called `_ags`:

```python
# Storing
try:
    ag_indices = [ag.indices for ag in self._ags]
except AttributeError:  # catches when _ags didn't exist
    pass

# Rebuilding
self._ags = [self.universe.atoms[ag_ids] for ag_ids in ag_indices]
```

So then …
@mnmelo Thank you very much! I think the shared memory approach is definitely the way to go in the future. For now, in my opinion, it would perhaps be better to test the present working version, so that we can get a feel for how it works with the already existing analysis routines and whether there are any issues we overlooked. Anyway, I'm just a contributor to the project; I guess the final word on how to proceed should come from a core dev. @richardjgowers what do you think?
Yup, let's see what others say. Now would be a good time to strike while the iron is hot (and admittedly, it'd push me towards actually contributing to the parallelization, which I have been lazily putting off!)
Maybe simple tests for the parallel analysis. So for the serial `AnalysisBase` there's the stupid `FrameAnalysis` … We could make a `FrameSummer` …

Edit: …
I think once #363 is finished, we can then do, for example, with three threads:

…

where each of our Universes shares literally the same `Topology`. In fact, @richardjgowers, we could even start playing with shared …
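A hypothetical sketch of what that could look like, assuming a post-#363 `Universe` that can be handed an already-parsed topology object (an API that did not exist at the time of writing):

```python
import MDAnalysis as mda

TOP, TRAJ = "topol.tpr", "traj.xtc"   # placeholder file names

u0 = mda.Universe(TOP, TRAJ)
# hypothetical: reuse u0's parsed topology instead of re-reading the file,
# while each Universe still opens its own trajectory reader
universes = [mda.Universe(u0._topology, TRAJ) for _ in range(3)]
```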
@dotsdl While encapsulation is certainly a good thing, I think it brings little extra to this case. We may have a lighter Universe, but the Topology must then also somehow make its way to the parallel workers. We end up with the same question: whether to pass the Topology by pickling or by shared memory. Or am I missing your point?
Ok, so while we decide on how to share the topology between the processes, I might start working on the tests suggested by Richard, so we can make sure future versions of the parallel code work properly.

On a side note: perhaps in this case it makes more sense to use a shared memory paradigm, although if the new universe objects are really "thin", then passing copies of the objects between the main process and the workers shouldn't impact performance much?
Yeah, the tests should be independent of how we choose to make it work.
I edited `FrameAnalysis` and the tests so that it now also works in parallel. The order of the frames is checked by the tests. I didn't get the last part about `FrameSummer`, though. I think that as long as the parallelization strategy works in "read-only" mode, no race conditions should happen, right?
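For illustration, such an order-checking test could look roughly like this; the constructor and `run` signatures follow this PR's API, so treat the details as assumptions:

```python
from MDAnalysis.analysis.base import AnalysisBase

class FrameAnalysis(AnalysisBase):
    """Toy analysis that records the index of every frame it visits."""
    def __init__(self, universe, **kwargs):
        super(FrameAnalysis, self).__init__(universe, **kwargs)
        self.frames = []

    def _single_frame(self, ts):
        self.frames.append(ts.frame)

def test_parallel_frame_order(u):
    fa = FrameAnalysis(u)
    fa.run(parallel=True, threads=2)
    # parallel execution must visit the same frames, in the same order
    assert fa.frames == list(range(len(u.trajectory)))
```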
I was trying to expose the race conditions that we were having at the start of this issue, when we only had one reader for many threads. You're right that our current strategy will avoid this, but as we play with different ideas, we need to know we've not accidentally gone backwards. Maybe it's a little overkill for now, though.
So do you think we should wait for the topology refactor, or perhaps make this first working version available and then modify it if future changes affect it? In the second case, I might start working on docstrings.
I think finish #670, merge the new version of develop into this branch, and finish this before the refactor. The refactor won't break anything here; it will just make progressing easier. So having this as an experimental starting point is a good idea.
Retarget to 0.15?

I vote so.

Yeah, sure.
Retargeted. I think, with the discussion in #719, this will need to bake a bit longer than we have before 0.14 is released.
Fine by me :) I'll be available once we decide to start working on this again!
Once we pick it up again: a good candidate for parallelization is the density module. It's slow but pleasingly parallel, and you can get bragging rights if you are faster than, e.g., VMD's VolMap.
See also https://www.mdanalysis.org/pmda/
With @yuxuanzhuang's serialization of universes, possible since PR #2723, this approach should be re-evaluated in a new PR.
With reference to #617
MDAnalysis Discussion --> link
Branch --> link
Aim
Speed up trajectory analysis by spreading the computation over multiple threads. Slices of the trajectory are assigned to each thread, which then proceeds to run `AnalysisBase` instances.
Current syntax (may change)
In order to run a parallel analysis, just invoke the `run()` method with the optional arguments `parallel=True` and `threads=<nthreads>`.
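As a sketch of that call pattern (`SomeAnalysis` stands in for any `AnalysisBase` subclass, and the file names are placeholders):

```python
import MDAnalysis as mda

u = mda.Universe("topol.tpr", "traj.xtc")    # placeholder files

analysis = SomeAnalysis(u)     # AnalysisBase subclasses take a Universe
analysis.run()                 # serial, as before
analysis.run(parallel=True, threads=4)       # parallel, on 4 threads
```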
To do

- `run(parallel=True)` method in the `AnalysisBase` class, so that single analysis jobs can be parallelized without using the `ParallelProcessor` class.

Benchmarks
Benchmark performed on a system of N=250 k atoms (294 configurations, size ~1 GB).
Efficiency is computed as the ratio between the core speed (cfg s⁻¹ thread⁻¹) and the speed of a serial analysis.
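Spelled out (the variable names are mine, with t_serial and t_parallel the wall-clock times of the serial and parallel runs):

```python
def efficiency(n_cfg, t_serial, t_parallel, n_threads):
    core_speed = n_cfg / (t_parallel * n_threads)  # cfg per second per thread
    serial_speed = n_cfg / t_serial                # cfg per second, serial
    return core_speed / serial_speed  # == t_serial / (n_threads * t_parallel)
```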
Compatibility issues

Due to the implementation of the parallelization, `AnalysisBase` needs to take a Universe object as argument.

Last update: Fri Jan 22 11:34:18 CET 2016