From c57118582767cf085e059c2dde41442710d3f729 Mon Sep 17 00:00:00 2001 From: George Bisbas Date: Fri, 30 Jun 2023 15:54:07 +0100 Subject: [PATCH] docs: Update for Custom topology --- FAQ.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/FAQ.md b/FAQ.md index ace1b72b40..258d45a505 100644 --- a/FAQ.md +++ b/FAQ.md @@ -567,7 +567,7 @@ By default, Devito compiles the generated code using flags that maximize the run ## Can I control the MPI domain decomposition -Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. +Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `class CustomTopology` in [distributed.py](https://github.com/devitocodes/devito/blob/master/devito/mpi/distributed.py) and `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. [top](#Frequently-Asked-Questions)