Possibly enabling oversubscribe/cpu-pinning in framework

Here are some thoughts following the work I did in https://github.com/easybuilders/easybuild-easyblocks/pull/3917

In general, I think we do not want MPI-related commands to fail because of a lack of resources.

For example, many easyconfig (EC) files already set environment variables to allow oversubscription of resources at steps that invoke `mpirun` or similar.

I've given this some thought, and I feel that implementing this would require two decisions:
- Oversubscription vs. (or in conjunction with) CPU pinning
- Whether this behavior is on by default or on demand

## Oversubscription vs. (or in conjunction with) CPU pinning

I think both have their pros and cons.
Oversubscription would allow us to defer choosing the cores to the resource manager, but it might not be possible to enforce `max_parallel`.
For example, in [HPL](https://github.com/easybuilders/easybuild-easyblocks/pull/3917) the test will not run with fewer than 4 processes.
Let's say we set `max_parallel = 2` and allow oversubscription. If the system running EB has more than 2 CPUs, running
```
mpirun -n 4 --map-by :oversubscribe ...
```
will actually end up using more CPUs than were requested with `max_parallel`.
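
To make the mismatch concrete, here is a rough sketch of the reasoning. It is a simplified model, not a description of what any MPI implementation guarantees: it assumes that, without pinning, each rank can land on its own core until the machine runs out of cores. The function names are hypothetical and only used for illustration.

```python
def cores_touched(n_ranks, available_cores):
    """Simplified model: with oversubscription allowed and no pinning,
    ranks spread over distinct cores until the cores run out."""
    return min(n_ranks, available_cores)


def exceeds_max_parallel(n_ranks, available_cores, max_parallel):
    """True if the run could use more cores than max_parallel allows."""
    return cores_touched(n_ranks, available_cores) > max_parallel


# HPL-like case from above: 4 ranks required, max_parallel = 2.
# On an 8-core machine the limit can be exceeded; on a 2-core machine it cannot.
print(exceeds_max_parallel(4, 8, 2))
print(exceeds_max_parallel(4, 2, 2))
```

In this model, oversubscription alone only respects `max_parallel` when the machine has no more cores than `max_parallel` to begin with.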

I am still exploring the capabilities of `--map-by` and `--hostfile` for OpenMPI, but I am not sure whether it is possible to tell `mpirun` to never use more than a specific number of processors, even if more are available, without explicitly pinning the cores.

On the other hand, we could do something similar to what I did in https://github.com/easybuilders/easybuild-easyblocks/pull/3917 and default to pinning if either of the following is true:
- MPI is not available
- `self.cfg.parallel` is lower than a specified requested value

There are a few cons here:
- We have to know how the CPUs are numbered. I am assuming we defer to `hwloc`, so we might always be able to do a `bind-to core` and then give sequential numbers (e.g. `req=7`, `max_parallel = 3` --> `--cpu-set 0,1,2,0,1,2,0`). In the HPL PR, at the time of writing, I am binding all processes to CPU 0, as I am not sure this is always reliable.
- We risk binding to a CPU that is already in use on the machine instead of one that is currently free.
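
The sequential numbering in the first point could be generated along these lines. This is only a sketch: `round_robin_cpu_list` is a hypothetical helper, and the default of numbering CPUs `0..max_parallel-1` is exactly the assumption flagged above as not verified to be reliable. The optional `allowed` argument lets the caller pass an explicit list of CPU ids instead.

```python
def round_robin_cpu_list(req, max_parallel, allowed=None):
    """Return one CPU id per requested rank, cycling over at most
    max_parallel CPUs.

    allowed: optional list of CPU ids to cycle over; when omitted,
    ids 0..max_parallel-1 are assumed to exist (an assumption).
    """
    if req < 1 or max_parallel < 1:
        raise ValueError("req and max_parallel must be >= 1")
    if allowed is None:
        cpus = list(range(max_parallel))
    else:
        cpus = list(allowed)[:max_parallel]
    if not cpus:
        raise ValueError("no CPUs to bind to")
    return [cpus[i % len(cpus)] for i in range(req)]


# Example from above: req=7, max_parallel=3
print(",".join(str(c) for c in round_robin_cpu_list(7, 3)))  # 0,1,2,0,1,2,0
```

On Linux, something like `sorted(os.sched_getaffinity(0))` could be passed as `allowed` to restrict the choice to CPUs the current process may run on. That does not address the second point: an allowed CPU can still be busy.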


I will try to investigate this further and come up with (hopefully no more than 2) viable implementations, but if someone more experienced could give some ideas or feedback, it would be really appreciated.

## Where could this be implemented

- If we decide this behavior should always be on, this could be done at toolchain (TC) loading by setting the appropriate environment variables depending on the MPI family.
- An alternative could be to modify https://github.com/easybuilders/easybuild-framework/blob/develop/easybuild/tools/toolchain/mpi.py#L273 to add an extra parameter `oversubscribe`, which means we would then need to properly enforce that every MPI command is generated through this helper function.
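
As a very rough illustration of the second option, a helper with an `oversubscribe` parameter might look like the following. This is a standalone sketch, not the framework's existing code: the function name, the family key and the lookup table are all hypothetical, and only the OpenMPI flag already used earlier in this issue is filled in.

```python
# Hypothetical table: extra mpirun arguments per MPI family.
# Only OpenMPI is listed, using the flag shown earlier in this issue.
OVERSUBSCRIBE_ARGS = {
    "OpenMPI": ["--map-by", ":oversubscribe"],
}


def mpi_cmd_sketch(cmd, nr_ranks, mpi_family="OpenMPI", oversubscribe=False):
    """Build an mpirun command line, optionally allowing oversubscription."""
    parts = ["mpirun", "-n", str(nr_ranks)]
    if oversubscribe:
        if mpi_family not in OVERSUBSCRIBE_ARGS:
            raise ValueError("no oversubscribe flag known for %s" % mpi_family)
        parts.extend(OVERSUBSCRIBE_ARGS[mpi_family])
    parts.append(cmd)
    return " ".join(parts)


print(mpi_cmd_sketch("./xhpl", 4, oversubscribe=True))
# mpirun -n 4 --map-by :oversubscribe ./xhpl
```

Keeping the per-family flags in one table is what would make it practical to require that every MPI command goes through the helper; other families would need their own entries.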
