ompi/info: introduce support for the mpi_memory_alloc_kinds info object (II) #13055
base: main
As a side note, I do have a bunch of tests that I developed as part of the project, and we might need to find a good place to put them. The tests do require some human interaction and the output is platform dependent, but it's a good starting point in my opinion. For the World model:
and for Sessions:
Why does HCOLL need to be excluded?
hcoll needs to be excluded if you compile with ROCm support, they don't seem to like each other

return num_unique;
}

#if 0
do we want to leave this debug code in the PR?
add code to handle the memkind info objects defined in MPI 4.1 Signed-off-by: Edgar Gabriel <Edgar.Gabriel@amd.com>
add an API to the accelerator component to retrieve the memory_alloc_kind information that is supported by the component. The values stored/returned are based on the side document that is about to be ratified, see https://github.com/mpi-forum/mem-alloc/blob/main/mem_alloc.tex Signed-off-by: Edgar Gabriel <Edgar.Gabriel@amd.com>
This PR introduces support for the `mpi_memory_alloc_kinds` and `mpi_assert_memory_alloc_kinds` info objects as defined in the MPI 4.1 document and in the side document specifying the values for the three supported accelerator kinds.

The logic is surprisingly twisted and I would not be surprised if some of the code here would have to be revised over time.
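Both info keys carry a comma-separated list of memory kinds, each optionally followed by colon-separated restrictors (for example `mpi:alloc_mem`). The sketch below is only an illustration of that format and of the all-or-nothing rule for `mpi_assert_memory_alloc_kinds` spelled out later in this description; it is not the code in this PR. The function names, the fixed buffer sizes, and the caller-supplied list of supported kinds are all assumptions made for the example, and restrictors are deliberately not validated.

```c
#include <string.h>

#define MAX_KINDS 16      /* hypothetical limit, chosen for the example */
#define MAX_KIND_LEN 128  /* hypothetical limit, chosen for the example */

/* Split a comma-separated memory-alloc-kinds value into its entries.
 * Each entry keeps its restrictors, e.g. "mpi:alloc_mem".
 * Empty entries are skipped; whitespace is not trimmed.
 * Returns the number of entries stored in out, or -1 on overflow. */
int split_kinds(const char *value, char out[][MAX_KIND_LEN], int max_out)
{
    int count = 0;
    const char *start = value;

    while (*start != '\0') {
        const char *end = strchr(start, ',');
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);

        if (len > 0) {
            if (count == max_out || len >= MAX_KIND_LEN) {
                return -1;
            }
            memcpy(out[count], start, len);
            out[count][len] = '\0';
            count++;
        }
        if (end == NULL) {
            break;
        }
        start = end + 1;
    }
    return count;
}

/* Return 1 if the kind name of an entry (the part before the first ':')
 * matches one of the supported names, 0 otherwise. */
int kind_is_supported(const char *entry, const char *const supported[],
                      int num_supported)
{
    size_t name_len = strcspn(entry, ":");

    for (int i = 0; i < num_supported; i++) {
        if (strlen(supported[i]) == name_len &&
            strncmp(entry, supported[i], name_len) == 0) {
            return 1;
        }
    }
    return 0;
}

/* All-or-nothing check for an mpi_assert_memory_alloc_kinds value:
 * returns 1 if every entry names a supported kind, and 0 if the whole
 * value should be ignored (empty, malformed, or any unknown kind). */
int assert_value_is_usable(const char *value, const char *const supported[],
                           int num_supported)
{
    char entries[MAX_KINDS][MAX_KIND_LEN];
    int n = split_kinds(value, entries, MAX_KINDS);

    if (n <= 0) {
        return 0;
    }
    for (int i = 0; i < n; i++) {
        if (!kind_is_supported(entries[i], supported, num_supported)) {
            return 0;
        }
    }
    return 1;
}
```

With a supported list of `system` and `mpi`, a value such as `system,mpi:alloc_mem` would pass the check, while `system,bogus` would be rejected as a whole rather than reduced to `system`.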
There are two ways in which the user can specify the `mpi_memory_alloc_kinds` info object: either as a runtime argument to `mpiexec`, namely `-memory-alloc-kinds`, or as part of the info object passed to `MPI_Session_init`. There are two predefined memory-alloc-kinds, namely `system` and `mpi`, the latter having three potential restrictors. When the user retrieves the info object, e.g. using `MPI_Comm_get_info` on `MPI_COMM_WORLD`, the MPI library is allowed to return more memory kinds than requested by the user.

This freedom has been applied in the following manner here:
`system` and/or `mpi` memory-alloc-kinds, we add them to the list of provided memkinds, in the case of the `mpi` memory-alloc-kind fully spelled out with all three restrictors.

This default memkind support is applied to all new {Comm/File/Window} objects, unless the user sets `mpi_assert_memory_alloc_kinds` during object creation (e.g. `MPI_Comm_dup_with_info` or `MPI_File_open`). The user can restrict the memory kinds supported by the object with `mpi_assert_memory_alloc_kinds`, i.e. setting this info object will influence `mpi_memory_alloc_kinds` on that object. I am not sure whether there is another example in the MPI spec where providing one info object influences the value of another info object, but in my understanding the MPI 4.1 specification is pretty clear that this is what is expected to happen.

Another difference in the handling of
`mpi_assert_memory_alloc_kinds` is that if a provided value is not recognized, the entire `mpi_assert_memory_alloc_kinds` is dropped/ignored, not just the unrecognized memkind itself. (This is my reading of the specification on what is expected to happen.)

Lastly, there could probably be some discussion whether the location chosen for the majority of the code is appropriate (i.e. ompi/info), a directory which so far has contained the code to manage the `MPI_Info` object in general. After some discussion I thought this is the most appropriate location, since