Possibly enabling oversubscribe/cpu-pinning in framework #4993

@Crivella


Putting some thoughts here, following the work I did on easybuilders/easybuild-easyblocks#3917

In general, I think we do not want MPI-related commands to fail due to a lack of resources.

E.g. many EC files already set environment variables to allow oversubscription of resources in steps that invoke mpirun or similar.

I've given this some thought, and I feel that implementing this would require 2 decisions:

  • Oversubscription vs. (or in conjunction with) CPU pinning
  • Have this behavior on by default or on demand

Oversubscription vs. (or in conjunction with) CPU pinning

I think both have their pros and cons.
Oversubscription would let us defer the choice of cores to the resource manager, but it might not be possible to enforce max_parallel.
For example, the HPL test will not run with fewer than 4 processes.
Let's say we set max_parallel = 2 and allow oversubscription. If the system running EB has more than 2 CPUs, running

mpirun -n 4 --map-by :oversubscribe ...

will actually end up using more CPUs than were requested with max_parallel.
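
To make the scenario concrete, here is a minimal sketch of how such a command could be assembled. `build_mpirun_cmd` is a hypothetical helper, not an existing EasyBuild function, and `./xhpl` is only a placeholder program name; the `--map-by :oversubscribe` option is the one from the command above.

```python
def build_mpirun_cmd(required_ranks, max_parallel, program):
    """Hypothetical helper: build an OpenMPI mpirun argument list.

    ':oversubscribe' is added only when more ranks are required than
    max_parallel allows. Note that nothing here limits how many CPUs
    mpirun will actually use, which is exactly the problem described above.
    """
    cmd = ['mpirun', '-n', str(required_ranks)]
    if required_ranks > max_parallel:
        cmd += ['--map-by', ':oversubscribe']
    cmd.append(program)
    return cmd


# HPL needs at least 4 processes, but max_parallel is 2
print(' '.join(build_mpirun_cmd(4, 2, './xhpl')))
```

On a machine with 4 or more free CPUs, this command would likely spread the 4 ranks over 4 CPUs, so max_parallel = 2 is not respected.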

I am still exploring the capabilities of --map-by and --hostfile for OpenMPI, but I am not sure whether it is possible to tell mpirun never to use more than a specific number of processors, even if more are available, without explicitly pinning the cores.

On the other hand, we could do something similar to what I did in easybuilders/easybuild-easyblocks#3917, so that we default to pinning if one of the following is true:

  • MPI is not available
  • self.cfg.parallel is lower than a specified requested value
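
The two conditions above could be expressed roughly as follows. This is only a sketch: `should_pin` and its parameters are hypothetical names, and how MPI availability would be detected is left open.

```python
def should_pin(mpi_available, parallel, required_ranks):
    """Hypothetical check: fall back to CPU pinning when MPI is not
    available, or when the configured parallelism (self.cfg.parallel in
    an easyblock) is lower than the number of ranks the command needs."""
    return (not mpi_available) or parallel < required_ranks


# HPL example: 4 ranks required, parallel limited to 2
print(should_pin(mpi_available=True, parallel=2, required_ranks=4))
```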

There are a few cons here:

  • We have to know how CPUs are numbered. I am assuming we defer to hwloc, so we might always be able to do a bind-to core and then give sequential numbers (e.g. req=7, max_parallel=3 --> --cpu-set 0,1,2,0,1,2,0). (In the HPL PR, at the time of writing, I am binding all processes to 0, as I am not sure this is always reliable.)
  • We risk binding to a CPU that is already in use on the machine instead of one that is currently free
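
The sequential numbering from the first point can be sketched as a round-robin over the allowed CPU ids. `round_robin_cpu_list` and its `cpu_ids` parameter are hypothetical names, and the default assumes CPUs are numbered 0..max_parallel-1, which is exactly the assumption questioned above.

```python
from itertools import cycle, islice


def round_robin_cpu_list(required_ranks, max_parallel, cpu_ids=None):
    """Hypothetical helper: assign one CPU id per rank, cycling over at
    most max_parallel CPUs.

    cpu_ids defaults to 0..max_parallel-1 (an assumption about numbering).
    """
    if required_ranks < 1 or max_parallel < 1:
        raise ValueError("required_ranks and max_parallel must be >= 1")
    if cpu_ids is None:
        cpu_ids = range(max_parallel)
    # never use more than max_parallel distinct CPUs
    allowed = list(cpu_ids)[:max_parallel]
    return list(islice(cycle(allowed), required_ranks))


# req=7, max_parallel=3
print(','.join(str(c) for c in round_robin_cpu_list(7, 3)))  # 0,1,2,0,1,2,0
```

On Linux, `sorted(os.sched_getaffinity(0))` returns the CPU ids the current process is allowed to run on and could be passed as `cpu_ids`. It does not say which of those CPUs are currently idle, so the second point still stands.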

I will investigate this further and try to come up with (hopefully no more than 2) viable implementations, but if someone more experienced could give some ideas or feedback, it would be really appreciated.

Where could this be implemented
