
Scattering OMP parallelisation #243

Merged
rc83 merged 7 commits into mlund:master on Feb 7, 2020

Conversation

@rc83 (Collaborator) commented Feb 4, 2020

Description

Refactor and parallelise the scattering analysis.

OpenMP is enabled for loops in the sample methods that have O(N^2) complexity, where N is the number of particles.

Furthermore, the sample methods are rewritten to allow automatic vectorisation of trigonometric functions in GCC by using the libmvec vector math library when --ffast-math is enabled. This speeds up the analysis by a factor of 4 on modern processors.
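The PR's actual implementation is not reproduced here; as a rough, hedged sketch of the two ideas described above (OpenMP across the sampling loop, plus an inner trigonometric loop simple enough for GCC to vectorise through libmvec when --ffast-math is on), one might write something along these lines, with all names being placeholders:

#include <cmath>
#include <vector>

// Hypothetical sketch, not the PR's code: scattering intensity for a set of
// q-points from precomputed dot products q·r_i (one row per q-point).
std::vector<double> sample_intensity(const std::vector<std::vector<double>> &qr_products) {
    std::vector<double> intensity(qr_products.size());
    // The outer loop over q-points is shared among OpenMP threads.
#pragma omp parallel for
    for (size_t k = 0; k < qr_products.size(); ++k) {
        double sum_cos = 0, sum_sin = 0;
        // A plain loop over contiguous data; with --ffast-math GCC may replace
        // std::cos/std::sin by their vectorised libmvec counterparts.
        for (double qr : qr_products[k]) {
            sum_cos += std::cos(qr);
            sum_sin += std::sin(qr);
        }
        intensity[k] = sum_cos * sum_cos + sum_sin * sum_sin;
    }
    return intensity;
}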

Checklist

  • No unit tests were provided.
  • code naming scheme follows the convention
    • Constants for OMP ranges are single capital letters, which is a grey zone of the naming rules; the other rules are honoured. Variables have been extensively renamed from the single-letter physicist convention to English words to follow our programmers' convention.
  • the source code is well documented
  • the user manual is updated regarding the effective parallelisation
  • performance is checked in supported configurations: (GCC, Clang) × (GNU/Linux, MacOSX)

Note

This pull request includes, and thus supersedes, pull request #241.

@mlund added this to the Version 2.4.0 milestone on Feb 4, 2020
@mlund (Owner) left a comment


Nice PR! Let's think of a unit test, for example just a set of fixed particles in a box.
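A minimal sketch in that spirit, hedged, and reusing the hypothetical sample_intensity helper from the description sketch above rather than any Faunus symbol: for two fixed particles a distance d apart with q along the separation axis, |Σ exp(iq·r)|² = 2 + 2·cos(qd), which a doctest case can check directly.

#include <doctest/doctest.h>
#include <cmath>
#include <vector>

// Hypothetical unit test: two fixed particles separated by d, scattering
// vector of magnitude q along the separation axis.
TEST_CASE("scatter: fixed particles in a box") {
    const double d = 5.0, q = 0.3;
    const std::vector<std::vector<double>> qr_products = {{q * 0.0, q * d}};
    const double expected = 2.0 + 2.0 * std::cos(q * d); // |sum exp(iq·r)|^2
    CHECK(sample_intensity(qr_products).at(0) == doctest::Approx(expected));
}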

src/analysis.cpp Outdated
        break;
    }
}

ScatteringFunction::ScatteringFunction(const json &j, Space &spc) try : spc(spc) {
    from_json(j);
    name = "scatter";
-   usecom = j.value("com", true);
+   use_com = j.value("com", true);
    save_after_sample = j.value("stepsave", false); // save to disk for each sample
@mlund (Owner):

Add keys to docs/schema.yml

@mlund (Owner):

And to _docs/analysis.md

@rc83 (Collaborator, author):

How come this could come through like this from pull request #241? :-D Done. The documentation shall also cover ipbc. As ipbc has no meaning for the debye scheme, shall we keep it as an independent attribute, or shall we have three different schemes?

@mlund (Owner):

Yes, debye is quite different and also requires dq, qmin, qmax, while these are automatic with explicit and ipbc, which on the other hand require qmax. I would keep ipbc as an option in the explicit scheme and split debye into a separate analysis.
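A hedged sketch of how that split could look on the input-parsing side (the struct, the function, and the boolean flag are assumptions for illustration; only the keys dq, qmin, qmax, and ipbc come from the comment above):

#include <nlohmann/json.hpp>

// Hypothetical sketch: a separate "debye" analysis reads its own q-grid keys,
// while the explicit scheme (optionally with ipbc) needs only qmax.
struct ScatterInput {
    double dq = 0, qmin = 0, qmax = 0;
    bool ipbc = false;
};

ScatterInput parseScatterKeys(const nlohmann::json &j, bool debye_scheme) {
    ScatterInput in;
    if (debye_scheme) { // q-grid must be given explicitly by the user
        in.dq = j.at("dq").get<double>();
        in.qmin = j.at("qmin").get<double>();
        in.qmax = j.at("qmax").get<double>();
    } else {            // explicit scheme; the q-grid otherwise follows from the box
        in.qmax = j.at("qmax").get<double>();
        in.ipbc = j.value("ipbc", false);
    }
    return in;
}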

src/scatter.h Outdated
}
// as of January 2020 the std::transform_reduce is not implemented in libc++
T sum_cos = 0;
for (auto &qr : qr_products) {
@mlund (Owner):

How well does Eigen fare for this operation? It's supposed to handle vectorisation on a number of architectures. I believe the operation here would be qr_products.sin().sum(). Note also that for the coming Eigen 4, iterators are supported.

@mlund (Owner) commented Feb 4, 2020:

Btw., your sin/cos optimization may also be applicable to the Ewald summation:

Q += i.charge * EwaldData::Tcomplex(std::cos(dot), std::sin(dot)); // 'Q^q', see eq. 25 in ref.
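Purely as a hedged illustration of how the same split could carry over to the Ewald k-sum (the function and argument names are placeholders, not the Ewald code's), assuming the dot products k·r_i and the charges were first gathered into Eigen arrays:

#include <Eigen/Dense>
#include <complex>

// Hypothetical sketch: Q(k) = sum_i q_i * exp(i k·r_i), with cos/sin evaluated
// on whole arrays so that Eigen/the compiler can vectorise them.
std::complex<double> structureFactor(const Eigen::ArrayXd &dots, const Eigen::ArrayXd &charges) {
    const double real_part = (charges * dots.cos()).sum();
    const double imag_part = (charges * dots.sin()).sum();
    return {real_part, imag_part};
}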

@mlund (Owner):

Update: on GCC9/MacOS, the separate cos/sin loops (ifdef __GNUC__) are slower.

@mlund (Owner):

The following gives a factor four speed-up on GCC and Clang:

Eigen::VectorXd qr_products;
qr_products.noalias() = positions * q;
T sum_sin = qr_products.array().sin().sum();
T sum_cos = qr_products.array().cos().sum();

where positions must be a MatrixXd(N,3) object. Best to await a unit test to see if it's doing the right thing, though :-)
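For context, a hedged sketch of how the positions matrix and the two sums could be assembled (the particle container and the function name are placeholders, not code from the PR):

#include <Eigen/Dense>
#include <vector>

// Hypothetical glue for the snippet above: copy particle positions into an
// N x 3 matrix once per sample, then evaluate all q·r_i products in a single
// matrix-vector product before the vectorised sin/cos reductions.
void sumSinesCosines(const std::vector<Eigen::Vector3d> &positions_in, const Eigen::Vector3d &q,
                     double &sum_sin, double &sum_cos) {
    Eigen::MatrixXd positions(positions_in.size(), 3);
    for (size_t i = 0; i < positions_in.size(); ++i)
        positions.row(i) = positions_in[i].transpose();
    const Eigen::VectorXd qr_products = positions * q;
    sum_sin = qr_products.array().sin().sum();
    sum_cos = qr_products.array().cos().sum();
}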

@rc83 (Collaborator, author) commented Feb 6, 2020

  1. I had to revert one commit originating from the master branch to make the pull request compile.
  2. There is no code related to performance testing yet, neither in scatter.h nor in scatter_test.h. I will prepare another commit tomorrow.

Creating a merge commit is probably the best strategy for merging this pull request.

rc83 added 7 commits February 7, 2020 14:00
OpenMP is enabled for loops in the sample methods that have O(N^2)
complexity, where N is the number of particles.

Furthermore, the sample methods are rewritten to allow automatic
vectorisation of trigonometric functions in GCC by using the libmvec
vector math library when --ffast-math is enabled. This speeds up
the analysis by a factor of 4 on modern processors.

Different implementations exhibit very different performance (a factor
of 5 observed) depending on the compiler and system environment.

Add unit and performance tests.
@rc83 merged commit 280ad6c into mlund:master on Feb 7, 2020
@rc83 deleted the scatter-omp branch on February 21, 2020, 12:25