
Create an OPAL "core" library for internal usage #10779

Merged: 2 commits into open-mpi:main on Sep 16, 2022

Conversation

@jjhursey (Member) commented Sep 7, 2022

  • Fixes issue #9869 (mpicc / mpirun / etc. carry unnecessary external dependencies)
  • Split libopen-pal.so into two libraries:
    • libopen-pal_core.la : Internal "core" portion of OPAL containing the essential source and MCA needed for mpicc/mpirun tools to link against. The "core" library is not installed.
    • libopen-pal.la : Includes "core" plus all of the other OPAL project sources. The .so version of this is installed.
  • The "core" library contains the following:
    • opal/class
    • opal/mca/backtrace
    • opal/mca/dl
    • opal/mca/installdirs
    • opal/mca/threads
    • opal/mca/timer
    • opal/runtime/*_core.[c|h]
    • opal/runtime/opal_info_support.c
    • opal/util (most - see Makefile.am)
    • opal/util/keyval
  • The "core" library is linked into the following tools instead of the full libopen-pal.so:
    • ompi/tools/mpirun
    • ompi/tools/wrappers (by extension of opal/tools/wrappers)
    • opal/tools/wrappers
  • The opal/runtime files were divided into a 'core' set vs 'main' set

@jjhursey (Member Author) commented Sep 7, 2022

This branch is functional and synced to main. I think it is done, but I'm marking it as a WIP so I can bounce it off of CI to test configurations that I haven't tried. I have a few other things I want to test as well.

The wrapper compilers and mpirun are all adjusted to use libopen-pal_core.la. I did not work on ompi_info since it will need to load the MCA components anyway to display their registered parameters.

Early eyes and feedback on this branch as it stands now are welcome. I will only push new commits from this point, and we can squash later.

I'm out of the office until next Tuesday, so I will resume work on this PR then.

@jjhursey (Member Author) commented Sep 7, 2022

Data point: Open MPI main compared to this PR, with the UCX libraries (which carry extra dependencies) linked in.

Below are the results for mpicc; mpirun and the other wrapper compilers show the same difference from main. There are no changes to the 3rd-party binaries (e.g., hwloc-ps, pmixcc, prterun) and no change to ompi_info, as noted in the prior comment.

Open MPI main

shell$ ldd ./bin/mpicc | sort
	/lib64/ld64.so.2 (0x00007fff98750000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fff97330000)
	libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fff974e0000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fff97860000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fff97a80000)
	libevent_core-2.1.so.6 => /lib64/libevent_core-2.1.so.6 (0x00007fff97dc0000)
	libevent_pthreads-2.1.so.6 => /lib64/libevent_pthreads-2.1.so.6 (0x00007fff97d90000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fff97400000)
	libhwloc.so.15 => /tmp/ompi-main-debug-orig/lib/libhwloc.so.15 (0x00007fff97ce0000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fff97440000)
	libm.so.6 => /lib64/libm.so.6 (0x00007fff97ba0000)
	libnuma.so.1 => /lib64/libnuma.so.1 (0x00007fff98300000)
	libopen-pal.so.0 => /tmp/ompi-main-debug-orig/lib/libopen-pal.so.0 (0x00007fff98570000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fff971f0000)
	libpmix.so.0 => /tmp/ompi-main-debug-orig/lib/libpmix.so.0 (0x00007fff97e60000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fff98260000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fff98230000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fff972a0000)
	libucm.so.0 => /smpi_dev/jjhursey/local/ucx-1.11.2/lib/libucm.so.0 (0x00007fff982b0000)
	libucp.so.0 => /smpi_dev/jjhursey/local/ucx-1.11.2/lib/libucp.so.0 (0x00007fff98440000)
	libucs.so.0 => /smpi_dev/jjhursey/local/ucx-1.11.2/lib/libucs.so.0 (0x00007fff98340000)
	libuct.so.0 => /smpi_dev/jjhursey/local/ucx-1.11.2/lib/libuct.so.0 (0x00007fff983d0000)
	libudev.so.1 => /lib64/libudev.so.1 (0x00007fff97ab0000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007fff97e30000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fff97300000)
	libz.so.1 => /lib64/libz.so.1 (0x00007fff973c0000)
	linux-vdso64.so.1 (0x00007fff98730000)

This PR

shell$ ldd ./bin/mpicc | sort
	/lib64/ld64.so.2 (0x00007fff9d8b0000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fff9c900000)
	libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fff9cab0000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fff9ce30000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fff9d050000)
	libevent_core-2.1.so.6 => /lib64/libevent_core-2.1.so.6 (0x00007fff9d390000)
	libevent_pthreads-2.1.so.6 => /lib64/libevent_pthreads-2.1.so.6 (0x00007fff9d360000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fff9c9d0000)
	libhwloc.so.15 => /tmp/ompi-main-debug/lib/libhwloc.so.15 (0x00007fff9d2b0000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fff9ca10000)
	libm.so.6 => /lib64/libm.so.6 (0x00007fff9d170000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fff9c7c0000)
	libpmix.so.0 => /tmp/ompi-main-debug/lib/libpmix.so.0 (0x00007fff9d430000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fff9d830000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fff9d800000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fff9c870000)
	libudev.so.1 => /lib64/libudev.so.1 (0x00007fff9d080000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007fff9d400000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fff9c8d0000)
	libz.so.1 => /lib64/libz.so.1 (0x00007fff9c990000)
	linux-vdso64.so.1 (0x00007fff9d890000)

Diff

A diff of the two ldd outputs shows that the mpicc generated by this PR no longer pulls in the UCX libraries, libnuma, or libopen-pal.so:

shell$ diff /tmp/mpicc-from-main.txt /tmp/mpicc-from-this-pr.txt 
12,13d11
< libnuma.so.1
< libopen-pal.so.0
19,22d16
< libucm.so.0
< libucp.so.0
< libucs.so.0
< libuct.so.0

@rhc54 (Contributor) commented Sep 7, 2022

I honestly have forgotten - IIRC, your "mpirun" wrapper has a dependency on OPAL only for installdirs so it can properly deal with any prefix (envar or whatever) - is that correct? I'm not sure it all really matters, but you could probably strip things down even further for just the "mpirun" wrapper if you were so inclined. Installdirs shouldn't have much in the way of dependencies (certainly not hwloc and pmix, for example).

@bosilca (Member) commented Sep 8, 2022

#9893 ?

@bwbarrett (Member) replied:

> #9893 ?

Josh took a patch I was working on to try to work around some of the really ugly parts of your patch around the MCA system. I thought one of the patches still listed you as author; that might have gotten boogered up, but a lot of the hard bits were from your patch.

@bosilca (Member) commented Sep 8, 2022

I'm just trying to understand if there was a conceptual issue with the approach proposed there.

@jjhursey (Member Author) replied:

Brian asked for help with his branch here, and I was asked to pick it up and drive it to completion for the v5 timeline. I saw some changes from George's branch were mixed in, so I assumed that Brian had already merged in the delta between the two approaches. I finished the work on Brian's branch, then squashed/rebased the branch to bring it up to the present. I added George and Brian as co-authors to the main commit due to their efforts in the branch up to the point where I picked it up.

The approach in this PR seems to fit nicely with the existing infrastructure. We could move some code around to make it a bit clearer which is "core" OPAL and which is "top/main" OPAL, but I'm hoping that the documentation (also a part of this PR) will cover that.

Note: This is still in draft PR status because I wanted to bounce it off of CI. I'll work on bringing it back for review later today.

Review thread on opal/mca/if/configure.m4 (outdated, resolved).

Review thread on opal/util/output.c:
@@ -63,6 +63,8 @@ static opal_output_stream_t verbose;
static char *output_dir = NULL;
static char *output_prefix = NULL;

static opal_output_pmix_cleanup_fn_t output_pmix_cleanup_fn = NULL;
Member commented:

Unlike pmix, output is now part of the core functionality. Why are we using pmix in the name of this type in the core library?

@jjhursey (Member Author) replied:

That is correct: output is part of the OPAL core, but pmix is not. The output.c code had this call in it (opal/util/output.c, lines 786 to 787 at b2c26c5):

    /* register it to be ignored */
    opal_pmix_register_cleanup(filename, false, true, false);

The call was inserted to tell the host RM not to clean up this file when cleaning up session directories and whatnot when the job completes. The open_file function may be called whenever a new framework/component opens a new output handle.

So I needed to find a way to get a pointer to opal_pmix_register_cleanup() (the OPAL wrapper around the PMIx_Job_control_nb PMIx function) into the OPAL core without linking to it, so tools wouldn't complain about a missing symbol.

To do this, I added a function registration mechanism. I kept the pmix name as a reminder of how it connects.
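
In rough outline, the mechanism looks like the following minimal sketch. This is not the actual code: the typedef here is simplified to take only the filename (the real one may carry the extra flags shown in the opal_pmix_register_cleanup() call above), and the "remember the filename" bookkeeping is only described in a comment.

    #include <stddef.h>

    /* Core library side (opal/util/output.c): no PMIx symbol is referenced
     * directly, only a function pointer that defaults to NULL. */
    typedef void (*opal_output_pmix_cleanup_fn_t)(const char *filename);

    static opal_output_pmix_cleanup_fn_t output_pmix_cleanup_fn = NULL;

    static void open_file(const char *filename)
    {
        /* ... open the output file ... */

        if (NULL != output_pmix_cleanup_fn) {
            /* Ask the host RM to ignore this file during session cleanup */
            output_pmix_cleanup_fn(filename);
        } else {
            /* No handler yet: remember the filename so it can be replayed
             * once the main library registers its handler (next thread). */
        }
    }

The main OPAL library, which does link against PMIx, passes its wrapper in through opal_output_register_pmix_cleanup_fn() during initialization, so the core library never needs the PMIx symbol at link time.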

@@ -474,6 +478,22 @@ void opal_output_set_output_file_info(const char *dir, const char *prefix, char
}
}

void opal_output_register_pmix_cleanup_fn(opal_output_pmix_cleanup_fn_t cleanup_fn)
Member commented:

I'm not sure I understand what the goal of this function is. Reading through it, it seems to call the cleanup with the currently registered output_pmix_cleanup_fn function and then install a new one?

@jjhursey (Member Author) replied:

The open_file function might have been called in a few of the OPAL core frameworks/components before we can register the handler during the OPAL main initialization (which knows the PMIx symbol). This function first registers all of the previously opened files (if any) before installing the handler for any files opened later. It makes sure we cover them all.
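
Concretely, something along these lines, as a minimal sketch: the opened_filenames / num_opened_files bookkeeping is hypothetical, standing in for however output.c actually tracks already-opened files.

    #include <stddef.h>

    typedef void (*opal_output_pmix_cleanup_fn_t)(const char *filename);

    /* Hypothetical bookkeeping: files opened before any handler existed. */
    static const char *opened_filenames[16];
    static size_t num_opened_files = 0;

    static opal_output_pmix_cleanup_fn_t output_pmix_cleanup_fn = NULL;

    void opal_output_register_pmix_cleanup_fn(opal_output_pmix_cleanup_fn_t cleanup_fn)
    {
        /* First, replay registration for any files opened before OPAL main
         * initialization could supply the PMIx-aware handler ... */
        for (size_t i = 0; i < num_opened_files; ++i) {
            cleanup_fn(opened_filenames[i]);
        }

        /* ... then install the handler for all files opened from here on. */
        output_pmix_cleanup_fn = cleanup_fn;
    }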

@jjhursey (Member Author) commented:

@artemry-nv Mellanox CI failed on mca_base_env_list even after #10788 (comment):

+ mca='--mca pml ucx --mca btl ^vader,tcp,openib,uct'
+ '[' 2 -gt 0 ']'
+ echo 'mca_base_env_list=XXX_A=1;XXX_B=2;XXX_C;XXX_D;XXX_E'
++ /__w/1/ompi/ompi_install/bin/mpirun --mca pml ucx --mca btl '^vader,tcp,openib,uct' --np 2 --tune /__w/1/ompi/test_amca.conf /__w/1/jenkins_scripts/jenkins/ompi/env_mpi
++ grep --count '^XXX_'
--------------------------------------------------------------------------
WARNING: The format of "mca_base_env_list" parameter is a delimited list of VAR=VAL or
VAR instances. By default, the delimiter is a semicolon: VAR1=VAL1;VAR2;VAR3=VAL3;...
You can set other via "mca_base_env_list_delimiter" parameter. If the delimiter is a
semicolon, the value of "mca_base_env_list" variable should be quoted to not interfere
with SHELL command line parsing. In the case where a value is not assigned to variable
VAR, the value will be taken from the current environment.
The following environment variable was not found in the environment:
  Variable:             XXX_C
  MCA variable value:   XXX_A=1;XXX_B=2;XXX_C;XXX_D;XXX_E
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No executable was specified on the prterun command line.

Aborting.
--------------------------------------------------------------------------

I think the next test down needs to be commented out as well (for now, until it is fixed):
https://github.com/mellanox-hpc/jenkins_scripts/blob/1d66874bec127c22a2248b8d0da4cb5c7bdf5158/jenkins/ompi/ompi_test.sh#L236-L243

There also appear to be some other cases that exercise mca_base_env_list that might need to be checked as well.

@jjhursey (Member Author) commented:

@artemry-nv I think the problem is this line that needs to be uncommented so that the XXX_C (and other) variables are defined for the next test.

bwbarrett and others added 2 commits September 14, 2022 10:57
(The squashed commit message repeats the PR description above.)

Co-authored-by: George Bosilca <bosilca@icl.utk.edu>
Co-authored-by: Brian Barrett <bbarrett@amazon.com>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
@jjhursey marked this pull request as ready for review on September 14, 2022 14:59
@jjhursey (Member Author) commented:

I squashed the commits down to two (functional commit and documentation commit). This is ready for final review.

@janjust (Contributor) commented Sep 14, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jjhursey merged commit 384d47d into open-mpi:main on Sep 16, 2022
@jjhursey deleted the mpi-tools-deps branch on September 16, 2022 11:44