Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove lib.parallel.distances #530

Closed
orbeckst opened this issue Nov 11, 2015 · 4 comments
Closed

remove lib.parallel.distances #530

orbeckst opened this issue Nov 11, 2015 · 4 comments

Comments

@orbeckst
Copy link
Member

I had a look at our parallel implementations of distance and geometry calculations. With PR #529 I propose a backward-compatible change that enables selecting between the serial (lib._distances) and OpenMP-enabled (lib._distances_openmp) C/Cython code that @richardjgowers wrote. I don't think that the OpenMP-enabled code was actually accessible in a convenient manner so far but with PR #529 that changes (see there).

  • I did some benchmarking (on an Intel Core 2 Duo 2.66 GHz, i.e. only 2 cores) and found that the OpenMP C/Cython code performs much better than the Cython parallel code in lib.parallel.distances (which is also based on OpenMP but within cython instead of C) as shown below.
  • Additionally, lib.parallel.distances can only take orthorhombic boxes into account and breaks/produces garbage with triclinic boxes. The C/Cython code can take PBC with all boxes into account.

Therefore, I propose to remove lib.parallel.distances.

Benchmarks

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import TPR, XTC
import MDAnalysis.lib.parallel.distances as pdist
import MDAnalysis.lib.distances as dist

u = mda.Universe(TPR, XTC)
heavy = u.select_atoms("protein and not name H*")
ow = u.select_atoms("name OW")

# with result array
D = dist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
D32 = D.astype(heavy.positions.dtype)

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="serial")
1 loops, best of 3: 372 ms per loop

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="OpenMP")
1 loops, best of 3: 203 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D32)
1 loops, best of 3: 412 ms per loop

# without results array
%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="serial")
1 loops, best of 3: 527 ms per loop

 %timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="OpenMP")
1 loops, best of 3: 310 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
1 loops, best of 3: 435 ms per loop
@orbeckst
Copy link
Member Author

Now that #535 is merged and full-feature OpenMP distance routines are available, I will remove lib.parallel.distances.

I would like to simply remove it and declare it prominently in the CHANGELOG (I don't think it will break a huge amount of code). Alternatively, I can leave a stub and deprecate it but even then the new function will be a bit different from the old one.

Comments?

@richardjgowers
Copy link
Member

I'd just remove it and make sure that the new stuff is properly documented

@orbeckst
Copy link
Member Author

Ok, will do.

orbeckst added a commit that referenced this issue Nov 12, 2015
- the cython/prange-based/OpenMP versions of some of the distance calculations were
  slower and had fewer features than the C/OpenMP based ones that are now available
  with backend="OpenMP" as implemented in PR #529 so these ones (and there tests)
  are fully removed
- closes issue #530
- The whole lib.parallel module was also removed because at the moment there is nothing
  in it; we can revisit the discussion about organization of lib (issue #287) when
  necessary.
orbeckst added a commit that referenced this issue Nov 12, 2015
orbeckst added a commit that referenced this issue Nov 12, 2015
orbeckst added a commit that referenced this issue Nov 12, 2015
- closes issue #530.
- improved general docs for MDAnalysis.lib
@orbeckst orbeckst modified the milestone: 0.13 Nov 12, 2015
richardjgowers added a commit that referenced this issue Nov 13, 2015
Issue #530: remove lib.parallel.distances
@orbeckst
Copy link
Member Author

Closed by #538.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants