
add support for mpi4py #190

Merged 2 commits into master on May 7, 2019
Conversation

@basnijholt (Member) commented Apr 30, 2019

No description provided.

adaptive/runner.py (outdated diff):
@@ -693,6 +700,8 @@ def _get_ncores(ex):
return 1
elif with_distributed and isinstance(ex, distributed.cfexecutor.ClientExecutor):
return sum(n for n in ex._client.ncores().values())
elif with_mpi4py and isinstance(ex, mpi4py.futures.MPIPoolExecutor):
return mpi4py.MPI.COMM_WORLD.size - 1
A reviewer suggested:
        ex.bootup()  # wait until all workers are up and running
        return ex._pool.size  # not public API!

@basnijholt (Member, Author) replied:

That's better. Does ex._pool.size work before all the workers are up and running? Adaptive can handle scaling of the pool size.

@jbweston (Contributor) commented May 1, 2019

does this "just work"? Aren't there some extra bits needed for launching workers? We should probably document this somehow...

@basnijholt (Member, Author) commented May 1, 2019

@jbweston I'll add some more details to the docs later.

In a nutshell, it works when calling your Python script like:

mpiexec -n 16 python -m mpi4py.futures run_learner.py

or in a SLURM job

srun -n $SLURM_NTASKS --mpi=pmi2 python -m mpi4py.futures run_learner.py
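Both launch commands assume a script along these lines. This is only a sketch: the toy function, bounds, and loss goal are illustrative assumptions rather than code from this PR, and the import guard is my addition so the file still runs where adaptive/mpi4py are not installed.

```python
# Hypothetical run_learner.py sketch for the launch commands above.
import math


def f(x):
    # Toy objective for the learner to sample adaptively.
    return math.exp(-(x ** 2))


if __name__ == "__main__":
    try:
        import adaptive
        from mpi4py.futures import MPIPoolExecutor
    except ImportError as exc:
        # Lets the sketch run even without adaptive/mpi4py installed.
        print(f"skipping MPI demo: {exc}")
    else:
        learner = adaptive.Learner1D(f, bounds=(-2, 2))
        # BlockingRunner drives the learner until the goal is reached,
        # submitting evaluations of f to the MPI worker pool.
        adaptive.BlockingRunner(
            learner,
            goal=lambda l: l.loss() < 0.01,
            executor=MPIPoolExecutor(),
        )
```

Under `mpiexec -n 16 python -m mpi4py.futures run_learner.py`, one rank runs this script and the remaining ranks serve as pool workers.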

@dalcinl commented May 3, 2019

> In a nutshell, it works when calling your Python script like:
>
> mpiexec -n 16 python -m mpi4py.futures run_learner.py

In your desktop or laptop, it can also work like this:

export MPI4PY_MAX_WORKERS=15
mpiexec -n 1 python run_learner.py

Or you can pass max_workers=15 programmatically when creating the executor instance.

In this case, the 15 workers will be MPI-spawned at runtime. I consider this the preferred way of using mpi4py.futures; unfortunately, it is not always supported by batch systems or vendor MPI implementations on supercomputers.

If your code uses no more than one executor instance at a time, the two methods are practically equivalent. The difference appears when you create a second executor: in the first form, all executors share all the workers; in the second (spawn) form, each executor gets its own set of workers. BTW, this is explained in the docs! Folks, for once in my life I care to write docs, and you don't RTFM? Come on! 😉
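The programmatic variant could be sketched as follows. The worker count and the demo task are illustrative assumptions, and the `RUN_MPI_DEMO` environment-flag guard is my addition so the file is importable without an MPI setup.

```python
# Minimal sketch of passing max_workers programmatically instead of
# setting the MPI4PY_MAX_WORKERS environment variable.
import os


def task(x):
    # Trivial demo work item.
    return x * x


if __name__ == "__main__" and os.environ.get("RUN_MPI_DEMO"):
    # Launch with: mpiexec -n 1 python this_script.py (RUN_MPI_DEMO=1).
    # The 15 workers are MPI-spawned at runtime, so each executor
    # created this way gets its own set of workers.
    from mpi4py.futures import MPIPoolExecutor

    with MPIPoolExecutor(max_workers=15) as ex:
        print(list(ex.map(task, range(4))))
```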

@dalcinl commented May 3, 2019

> does this "just work"?

You are hurting my feelings 😉

@basnijholt (Member, Author) commented:

@dalcinl thanks for the comments!

I've updated the explanation for the docs.

@akhmerov or @jbweston merge if you are happy with it.

@jbweston (Contributor) commented May 7, 2019

LGTM

@jbweston jbweston merged commit abc0f0e into master May 7, 2019
@basnijholt basnijholt mentioned this pull request May 7, 2019
@basnijholt basnijholt deleted the mpi4py_support branch May 8, 2019 23:14
3 participants