Skip to content

Conversation

ggouaillardet
Copy link
Contributor

add the OPAL_DEFINE_PACKAGE helper that define
--with-FOO=DIR
--with-FOO-cppflags=FLAGS
--with-FOO-ldflags=FLAGS
--with-FOO-libdir=DIR (deprecated)

OPAL_CHECK_PACKAGE2(prefix,
header,
library,
function,
extra-libraries,
package,
[action-if-found], [action-if-not-found],
includes)

is much easier to use and handle most of the work under the hood.

Signed-off-by: Gilles Gouaillardet gilles@rist.or.jp

@ggouaillardet
Copy link
Contributor Author

Refs #4722

@jsquyres this is a very early stage prototype.

long story short, we now have
--with-FOO=DIR --with-FOO-cppflags=CPPFLAGS --with-FOO-ldflags=LDFLAGS
--with-FOO-libdir=DIR is also available for convenience/compatibility but issues a warning as a deprecated option.

zlib and libevent:external uses the new OPAL_CHECK_PACKAGE2 macro and OPAL_DECLARE_PACKAGE helper.

if both --with-FOO-cppflags=CPPFLAGS and --with-FOO-ldflags=LDFLAGS are used, then the user only test what he asked for.
Otherwise, some checks are automated

  • do not test if --with-FOO=no
  • look for include files in DIR/include
  • try libs in DIR/lib and DIR/lib64

can you please share your thoughts on this new macro ?

the code itself is very lightly tested so do not worry too much about it at this stage.

@ggouaillardet ggouaillardet force-pushed the topic/opal_check_package branch from 4c1b304 to f49caf3 Compare February 16, 2018 12:24
@ggouaillardet
Copy link
Contributor Author

:bot:aws:retest

@ggouaillardet ggouaillardet force-pushed the topic/opal_check_package branch from f49caf3 to e290767 Compare February 16, 2018 13:50
@rhc54
Copy link
Contributor

rhc54 commented Feb 16, 2018

@ggouaillardet From what I can tell, this doesn't solve the problem we've been talking about on #4818 where one version of the package is installed in a standard location, and the user specifies a different version installed in a custom location. No matter how you set CPPFLAGS or LDFLAGS, the autoconf macros AC_CHECK_HEADER and AC_SEARCH_LIB will still pickup the version in the standard location and return success.

I've tried every trick I can find to get around that issue, with no success. Finally posted a question on StackOverflow to see if anyone has an idea. Bottom line so far: all you really can do is check for file and library existence in this scenario.

What is weird is that if you get the foo_CPPFLAGS and foo_LDFLAGS set correctly, then the component does get linked to the right version of the package! You just can't get autoconf to respect it when doing the checks.

@ggouaillardet ggouaillardet force-pushed the topic/opal_check_package branch from e290767 to a07d454 Compare February 16, 2018 14:53
@ggouaillardet
Copy link
Contributor Author

@rhc54 I am not sure we read each other correctly.

The issue in #4818, is that if pmix.h is both available in /usr/include/pmix.h and /opt/include/pmix.h, and you configure --with-pmix=/opt, then AC_CHECK_HEADER will be first invoked with CPPFLAGS=-I/opt and will use /usr/include/pmix.h and success, which is not what we expect.
If AC_CHECK_HEADER is invoked with CPPFLAGS=-I/opt/include, then /opt/include/pmix.h will be tested (and not /usr/include/pmix.h), so I do not see any issue with that.
If the user now configure --with-pmix-cppflags=-I/opt or configure --with-pmix-cppflags=/bogus, then AC_CHECK_HEADER will use /usr/include/pmix.h and success. To me, this is an end user error, so from the Open MPI point of view, i am fine with that.

I might have chosen my words poorly previously, and what I really meant by

if both --with-FOO-cppflags=CPPFLAGS and --with-FOO-ldflags=LDFLAGS are used, then the user only test what he asked for.

is Open MPI passes CPPFLAGS and LDFLAGS directly to AC_CHECK_HEADER and AC_SEARCH_LIB period.
Note that if the end user configure --with-pmix=/opt, the check will fail if /opt/include/pmix.h does not exist.

It is of course possible to parse $with_pmix_cppflags and test for the existence of pmix.h.
If we want to follow that road, I'd rather have --with-FOO-incdir=DIR[:DIR]* instead of --with-FOO-cppflags=CPPFLAGS

@rhc54
Copy link
Contributor

rhc54 commented Feb 16, 2018

@ggouaillardet I don't believe the following is true:

is Open MPI passes CPPFLAGS and LDFLAGS directly to AC_CHECK_HEADER and AC_SEARCH_LIB period.
Note that if the end user configure --with-pmix=/opt, the check will fail if /opt/include/pmix.h does not exist.

From my testing, AC_CHECK_HEADER and AC_SEARCH_LIB always include the standard locations, regardless of the CPPFLAGS and LDFLAGS that are passed. I have been unable to prevent them from doing so.

@ggouaillardet
Copy link
Contributor Author

@rhc54 miscommunication again, and I can take the blame for that one

Note that if the end user configure --with-pmix=/opt, the check will fail if /opt/include/pmix.h does not exist.

What I really meant is that configure --with-FOO=DIR invoked without --with-FOO-cppflags=CPPFLAGS , the new Open MPI macro will only invoke AC_CHECK_HEADER with CPPFLAGS=-IDIR/include if DIR/include/foo.h is readable, and fail otherwise.

when configure --with-FOO=DIR is invoked without --with-FOO-ldflags=LDFLAGS, then AC_SEARCH_LIB is only invoked with LDFLAGS=-LDIR/lib[64] if the directory exists (note, Open MPI does not test the existence of the library here)

@rhc54
Copy link
Contributor

rhc54 commented Feb 17, 2018

Understood - but it still won't work ☹️

The problem is that AC_CHECK_HEADER will see the header in the std location, and that is what will be checked - not the one in your DIR/include location. Ditto for AC_SEARCH_LIB.

The documentation implies this behavior as it states that the purpose of these functions is to find an instance of the search condition, not a specific instance of it. For example, when explaining AC_SEARCH_LIB:

The correct macro to use for this task is AC_SEARCH_LIBS, which is designed keeping into
consideration at least two important points:

There might be no need for further libraries to be added for the function to be available. This may
either be because the function is in the C library or because the library that it's found in is already
present in the LIBS variable's list. This does not mean the function is present in a library called libc;

Only one library carrying the function is needed, so testing should stop at the first hit. Testing further
libraries might very well lead to false positives and will certainly slow down the configuration step.

So the goal of these search routines is to look around the system at the std locations and any locations given by FLAGS until they find the first instance where the search is satisfied. Once that happens, they return "found" and are done.

We have been "getting lucky" all these years because:

  • if the search came back positive, we manually defined the FLAGS using the user-provided location
  • we learned early on to always put the test for "DIR/lib" before the test for "DIR/lib64", and almost everyone installs to the lib directory (so we were almost always correct). Where they didn't, we told people to specify the --with-foo-libdir option, and then we manually built the FLAGS to point there
  • if we were looking for a specific version, we could sometimes find a unique function for that version that would disqualify the std installation, especially on HPC systems that often stay several releases back
  • and most importantly, libtool is much more sophisticated then autoconf in this regard. It looks at the LDFLAGS and links the libraries based on that first, only then going to the std locations. So once we manually built the FLAGS, libtool saved us by linking to the right library.

What I suggest is that we only use AC_CHECK_HEADERS and AC_SEARCH_LIBS when we are looking for something in the std locations. Otherwise, we should look for existence of the actual file/library (not just the directory). If we want to check the library for a given function, we can do that using nm or its equivalent for the given system.

@ggouaillardet
Copy link
Contributor Author

@rhc54 here are my observations on OS X, with configure --with-zlib=/usr/local/Cellar/zlib/1.2.11 (e.g. use zlib 1.2.11 from homebrew instead of the default zlib (/usr/include/zlib.h and /usr/lib/libz.dylib)

from config.log

configure:10418: checking zlib.h usability
configure:10418: gcc -c   -I/usr/local/Cellar/zlib/1.2.11/include conftest.c >&5
configure:10418: $? = 0
configure:10418: result: yes
configure:10418: checking zlib.h presence
configure:10418: gcc -E  -I/usr/local/Cellar/zlib/1.2.11/include conftest.c
configure:10418: $? = 0
configure:10418: result: yes
configure:10418: checking for zlib.h
configure:10418: result: yes
configure:10515: result: looking for library in lib
configure:10517: checking for library containing deflate
configure:10548: gcc -o conftest   -I/usr/local/Cellar/zlib/1.2.11/include  -L/usr/local/Cellar/zlib/1.2.11/lib conftest.c  >&5
Undefined symbols for architecture x86_64:
  "_deflate", referenced from:
      _main in conftest-e22220.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:10548: $? = 1
configure:10548: gcc -o conftest   -I/usr/local/Cellar/zlib/1.2.11/include  -L/usr/local/Cellar/zlib/1.2.11/lib conftest.c -lz -lz  >&5
configure:10548: $? = 0
configure:10565: result: -lz
configure:11138: checking will zlib support be built
configure:11144: result: yes

bottom line

  • Open MPI checks /usr/local/Cellar/zlib/1.2.11/include/zlib.h is readable, and then invoke AC_CHECK_HEADERS with CPPFLAGS=-I/usr/local/Cellar/zlib/1.2.11/include. The macro does test /usr/local/Cellar/zlib/1.2.11/include/zlib.h and not /usr/include/zlib.h, so for me, everything works as expected regarding the header files.
  • Open MPI checks the existence of /usr/local/Cellar/zlib/1.2.11/lib and then invokes AC_SEARCH_LIB with LDFLAGS=-L/usr/local/Cellar/zlib/1.2.11/lib. The first gcc is invoked without -lz and fails, and the second one with -lz -lz (I will double check why twice later...) success.
    At first glance, it worked since /usr/local/Cellar/zlib/1.2.11/lib/libz.dylib was used for the test. If I give it a closer look, that could have failed if
    1. the deflate symbol was provided by someone else such as libc and hence does not require -lz (right now, I do not know what Open MPI can do about that)
    2. /usr/local/Cellar/zlib/1.2.11/lib/libz.dylib did not exist, was unreadable or with the wrong bitness. In this case, i guess /usr/lib/libz.dylib would have been tested instead (Open MPI can be enhanced and perform extra tests to handle some if not all scenario)
  • the library detection is less ideal than the header detection, but in this case, we never test something in /usr/{include,lib} instead of /usr/local/Cellar/zlib/1.2.11/{include,lib}.

What I suggest is that we only use AC_CHECK_HEADERS and AC_SEARCH_LIBS when we are looking for something in the std locations. Otherwise, we should look for existence of the actual file/library (not just the directory). If we want to check the library for a given function, we can do that using nm or its equivalent for the given system.

based on my observations (e.g. sh -x configure and not the autotools documentation), and at this stage, I do not believe we need to go that far, and this PR (plus a few extra tests) looks good enough to me to handle all real-life cases.
If you can describe a real-life case that is still not handled properly by this PR (module the upcoming extra tests), I will be happy to retract what I just wrote.

@rhc54
Copy link
Contributor

rhc54 commented Feb 17, 2018

Well, I respectfully disagree with your conclusion. I think you are getting fooled by a false positive.

Think about it for a bit and you'll begin to understand why. AC_CHECK_HEADERS and AC_SEARCH_LIBS both use compile tests to check for existence - you can see that in the output you posted. Now consider how a compiler works.

If you pass a CPPFLAGS value to a compiler, it looks in that location plus the standard locations for any include file. If it didn't do that, we'd all go nuts chasing down all the stuff like stdlib.h to add them back into CPPFLAGS.

So your test for zlib.h would have passed just as well if you had given a totally bogus path for --with-zlib, unless you asked it to check for a definition that doesn't exist in the version of zlib.h installed in the standard location. Sure, you'll see your CPPFLAGS on the cmd line, but the compiler is going to find zlib.h regardless so long as it is installed in /usr/local/include.

The only way to ensure you only check the given location is to avoid compile-based tests.

@rhc54
Copy link
Contributor

rhc54 commented Feb 17, 2018

Figured I'd try to clarify a bit more and then surrender 😄

My basic point I'm trying to make here is: once you have manually checked for existence and readability of the header file, running AC_CHECK_HEADERS doesn't do anything for us. It will always succeed if the header exists in a std location, regardless of what we pass for CPPFLAGS, so why waste the time running it?

The same is true for AC_SEARCH_LIBS - if the library we are looking for is in a std location (and the function is in that version), then AC_SEARCH_LIBS will always succeed, regardless of what we pass for LDFLAGS. The only thing the macro does for you is check that the specified function is a public symbol in at least one of the versions of the library in a combination of the std location plus the given LDFLAGS - it doesn't tell you that the version of the library in your specified location was the one that provided it.

So if we are given a non-std location, we might as well just test for existence and be done - unless we have some reason to believe that the specified package is not installed in a std location. I suppose we could run the macros without any FLAGS and see if we get a positive response - if we do, then just do the existence check on the given location. I'm not sure what that adds, hence my proposal to just check for existence and then trust the user.

HTH

@rhc54
Copy link
Contributor

rhc54 commented Feb 18, 2018

FWIW: only one reply so far to the StackOverflow issue - they suggested looking at boost for an example where the desired behavior appeared to be achieved. However, looking at their code:

        # We are going to check whether the version of Boost installed
        # in $boost_inc is usable by running a compilation that
        # #includes it.  But if we pass a -I/some/path in which Boost
        # is not installed, the compiler will just skip this -I and
        # use other locations (either from CPPFLAGS, or from its list
        # of system include directories).  As a result we would use
        # header installed on the machine instead of the /some/path
        # specified by the user.  So in that precise case (trying
        # $boost_inc), make sure the version.hpp exists.
        #
        # Use test -e as there can be symlinks.
        test -e "$boost_inc/boost/version.hpp" || continue
        CPPFLAGS="$CPPFLAGS -I$boost_inc"

So they first have to verify existence for the same reason we do. In their case, they run AC_COMPILE_IFELSE to verify that the header is okay, though it seems a fairly useless test unless you want to ensure that the header is from a minimal version and can test some function that didn't exist in an earlier one.

The StackOverflow ticket is here: https://stackoverflow.com/questions/48829078/autoconf-check-a-specific-library

I'll keep tracking it in case someone pipes up with a better answer, but I'm beginning to believe there is no really good solution. It seems to me that the only real thing we can do is check for existence to catch a typo, and then trust the user.

add the OPAL_DEFINE_PACKAGE helper that define
 --with-FOO=DIR
 --with-FOO-cppflags=FLAGS
 --with-FOO-ldflags=FLAGS
 --with-FOO-libdir=DIR (deprecated)

OPAL_CHECK_PACKAGE2(prefix,
                    header,
                    library,
                    function,
                    extra-libraries,
                    package,
                    [action-if-found], [action-if-not-found],
                    includes)

is much easier to use and handle most of the work under the hood.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
@ggouaillardet ggouaillardet force-pushed the topic/opal_check_package branch from a07d454 to 6b3dce2 Compare February 19, 2018 06:39
@bwbarrett
Copy link
Member

I think you're both right. Ralph's right, the AC_CHECK_HEADERS doesn't tell us precisely what we want to know (same with AC_SEARCH_LIBS), in that if the header's in the standard location and the user specifies --with--includes, the user can get fooled.

Giles is right that we should be calling both AC_CHECK_HEADERS and AC_SEARCH_LIBS at least once on every file to sanity check that they should compile. Our current system of calling multiple times doesn't work the way we want, but we should still check that the header can compile. There are cases where this isn't true, such as only supporting 64 bit compiles when OMPI is being built with -m32.

I haven't had a chance to look at this patch in depth; will do so today.

@rhc54
Copy link
Contributor

rhc54 commented Feb 20, 2018

I actually agree on the AC_CHECK_HEADERS part. The AC_SEARCH_LIBS, however, won't actually do anything for us as there is no guarantee that it even used the library we want checked. If our target library fails, then the search will merrily pickup the lib in the std location - and if that one passes, then the test succeeds.

From the references provided so far (and that I could find myself), it appears that all you really can do is:

  • check that the lib file exists in the provided location
  • tell libtool to use that one by setting the LDFLAGS to point at it

You can check for a specific function using AC_CHECK_LIB, but again, that only tells you that some library in either the indicated search path or in std locations has that function - it doesn't tell you that it is present in the lib you pointed at.

This seems to be a pretty common problem, and the above solution appears to be where everyone winds up - often after lots of agonizing ☹️

@bwbarrett
Copy link
Member

@rhc54, yeah, sorry, I meant AC_CHECK_LIB, not AC_SEARCH_LIBS. One should at least sanity check that something exists, although there's not more than that we can do.

Copy link
Member

@bwbarrett bwbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have strong objections to the half-change. We should either update all the CHECK_PACKAGE calls or none at all. I haven't had time to do this patch not because opal_check_package was hard to write, but because updating all the configure.m4 files was going to take some time. But it's the right thing to do.

Also, we shouldn't add opal_check_package2, but should re-define opal_check_package to do what we want, otherwise, we'll have two behaviors and nobody wins.

@@ -0,0 +1,237 @@
dnl -*- shell-script -*-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either -- autoconf -- or nothing, please

# so this sucks, but there's no way to get through the progression
# of header includes without killing off the cache variable and trying
# again...
unset opal_Header
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't need this code with the (proper) change to only checking the header / library location once.

# again...
unset opal_Header

# get rid of the trailing slash(es)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

AC_CHECK_HEADERS([$2], [opal_check_package_header_happy="yes"], [])
AS_IF([test "$opal_check_package_header_happy" = "no"],
[# no go on the as is - reset the cache and try again
unset opal_Header])],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

AC_CHECK_HEADERS([$2], [opal_check_package_header_happy="yes"], [], [$6])])

AS_IF([test "$opal_check_package_header_happy" = "yes"], [$4], [$5])
unset opal_check_package_header_happy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

dnl additonal -l arguments to link successfully, list them here.
dnl * dir_prefix: if the header/library is located in a non-standard
dnl location (e.g., /opt/foo as opposed to /usr), list it here
dnl * libdir_prefix: if the library is not under $dir_prefix/lib or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't match prototype.

dnl * action_if_found: if both the header and library are found and
dnl usable, execute action_if_found
dnl * action_if_not_found: otherwise, execute action_if_not_found
dnl * extra_includes: if including header_filename requires additional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't match name of argument in function above. Also, why on a new function is this after the action_if_found / action_if_not_found, which are traditionally the last two arguments?

opal_check_package_$1_orig_LDFLAGS="$$1_LDFLAGS"
opal_check_package_$1_orig_LIBS="$$1_LIBS"

AS_IF([test "$with_$6" != "no"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code would be a lot simpler if you defined a variable foobar=$with_$6 and used that in all the tests instead of constantly doing the m4 expansion thing.

opal_check_package_$1_orig_LIBS="$$1_LIBS"

AS_IF([test "$with_$6" != "no"],
[AS_IF([test -n "$with_$6_cppflags" || test -z "$with_$6" || test "$with_$6" = "yes"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this is here. It's literally the union of all options != no.

AC_DEFINE_UNQUOTED([OPAL_HAVE_ZLIB], [$opal_zlib_support],
[Whether or not we have zlib support])
OPAL_VAR_SCOPE_POP

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not remove the whole line?

@ggouaillardet
Copy link
Contributor Author

@bwbarret thanks for the review !

I agree that at the end there should be only one OPAL_CHECK_PACKAGE macro.
In its current state, this PR is really a proof of concept, and if/when we reach a consensus on the the new OPAL_CHECK_PACKAGE macro, I will be happy to change all the m4 files.

I will make the requested changes from now.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
@ggouaillardet
Copy link
Contributor Author

@bwbarrett I pushed a new commit that should address your review, plus some more revamp.

@rhc54
Copy link
Contributor

rhc54 commented Feb 23, 2018

@ggouaillardet I thought we were going to revise the lib section to avoid use of AC_SEARCH_LIB and use AC_CHECK_LIB instead, and only after checking for existence of the specified lib in the target location? As written, this looks like it will still generate a false positive every time you check for -LDIR/lib if the package exists in a std location.

@ggouaillardet
Copy link
Contributor Author

I will update the PR, thanks !

only use AC_SEARCH_LIBS when --with-FOO[=yes]
if --with-FOO-ldflags=LDFLAGS or --with-FOO=DIR is used, then use AC_CHECK_LIB

if --with-FOO=DIR is used, the macro is only tried if 'ls DIR/lib[64]/liblib.*'
returns more than one file. This is not yet bulletproof since
- if DIR/lib/libLIB.xyz is a match, the linker will not use that and will try lib
in the default path instead.
- if DIR/lib/libLIB.so has the wrong bitness/arch, the linker will ignore it and
try lib in the default path instead

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
@ggouaillardet
Copy link
Contributor Author

@rhc54 i updated the PR based on your latest message.

@rhc54
Copy link
Contributor

rhc54 commented Mar 7, 2018

what happened to the code that removed trailing '/'? I think we really need it, and no point in us putting it everywhere.

Copy link
Member

@bwbarrett bwbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to still not be on the same page; we're supposed to be ripping out a lot of the complicated code we still have in here progressing through search libraries.

AC_ARG_WITH([$1], [AC_HELP_STRING([--with-$1=DIR], [$2])])
AC_ARG_WITH([$1-cppflags], [AC_HELP_STRING([--with-$1-cppflags=FLAGS], [$3])])
AC_ARG_WITH([$1-ldflags], [AC_HELP_STRING([--with-$1-ldflags=FLAGS], [$4])])
AC_ARG_WITH([$1-libdir], [AC_HELP_STRING([--with-$1-libdir=DIR],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to split OPAL_DECLARE_PACKAGE and OPAL_DECLARE_PACKAGE_DEPRECATED, with only _DEPRECATED adding --with-$1-libdir. Otherwise, we end up adding a deprecated argument for new components, which is weird.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point, I will update this.

dnl [action-if-found], [action-if-not-found])
dnl --------------------------------------------------------------------
AC_DEFUN([_OPAL_CHECK_PACKAGE2_HEADER], [
AS_VAR_PUSHDEF([opal_With], [with_$1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This capitalization kind of makes me ragey. Why the semi-CamelCase, with underscores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opal_Lib and opal_Header have been around for a while, but I get your point and will update this too.

AC_CHECK_HEADERS([$3], [$5], [$6], [$4])],
[AS_IF([test -z "$opal_With" || test "$opal_With" = "yes"],
[AC_CHECK_HEADERS([$3], [$5], [$6], [$4])],
[AS_IF([test -r "$opal_With/$3"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed we were getting rid of this, didn't we? We were only going to check with_dir/include if with_cppflags wasn't set. There should only need to be a single call to AC_CHECK_HEADERS in this function, with a simple set if if statements to set CPPFLAGS as needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current code does not check DIR nor DIR/include if --with-FOO-cppflags is set, did I get that part wrong ?
Also, the current code fails if --with-FOO-cppflags is not used and nor DIR/header.h nor DIR/include/header.h exists. If we move to a single AC_CHECK_HEADERS invokation, handling this bozo case might be a bit convoluted. I am open to suggestions though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggouaillardet I think the use of the test -n option is what confused here - I honestly don't think we have used that variation elsewhere. Nothing wrong with it - just something I had to look up (and ask about), and still trip over until I get used to it.

The code looks generally correct to me. If they provide the cppflags option, then we just use it. If not, then we check std locations if opal_with is empty or "yes", then DIR, and then DIR/include. The only thing I believe we might change is to reduce the complexity a bit.

I believe what @bwbarrett was getting at is that you could use all those AS_IF's to simply set a local cppflags value, and then set CPPFLAGS and have a single call to AC_CHECK_HEADERS at the end. Would significantly simplify the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test -n didn't confuse me. I was saying that the code structure is way more complicated than it needs to be. It should be something like:

AS_IF([test -n "$with_FOO_cppflags"], [CPPFLAGS=$with_FOO_cppflags],
  AS_IF([test -n "$with_FOO" != "yes"], [CPPFLAGS=$with_foo/include])])
AC_CHECK_HEADERS([....])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhc54 @bwbarrett my question was how to elegantly handle the case in which --with-FOO-cppflags is not set and --with-FOO=DIR is used and nor DIR/foo.h nor DIR/include/foo.h exist.
Ideally, it would fail without aborting nor invoking AC_CHECK_HEADERS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that ensuring DIR/include/foo.h (and remember, we're NOT checking for DIR/foo.h) is important. If you feel strongly that it is for the case of --with-dir, I'd do something like:

if test -n "$with_FOO_cppflags" ; then
   add_CPPFLAGS="$with_FOO_cppflags"
elif test "$with_FOO" != "yes"; then
    if test ! -r "$with_FOO/include/$header_file"; then
         AC_MSG_ERROR([Didn't find expected header <blah> in <blah>.])
    fi
   add_CPPFLAGS="-I$with_FOO/include"
fi
CPPFLAGS=$CPPFLAGS $add_CPPFLAGS
AC_CHECK_HEADERS

Again, this all got so complicated it had a billion unintended side effects. Our whole goal is to get back to simple, even if we miss some corner cases in testing; we can define what we do, how it doesn't do what the user may want, but that the important part is IT'S DEFINED.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh??? Part of this whole thing was to ensure that we do check for DIR/foo.h, and then for DIR/include/foo.h, as that is where multiple vendors install by default. I don't understand why that is objectionable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's terrible. --with-FOO should never add DIR/foo.h. The whole point of this change is to add --with-FOO-cppflags so that if the header was in an odd spot, customers could add the right CPPFLAGS to get what they need. DIR/foo.h adds a really dangerous -I flag in all but some corner-case situations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I give up - this has morphed into something that doesn't appear to meet my needs at all. I'll just hardcode what I need into the .m4's where I need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwbarett One motivation for the change is to allow users to install their include files in a non standard location (in this case, it was /usr/include/pmix instead of /usr/include), so we cannot assume we only need to append /include to DIR and look for headers there.

That being said, and since --with-FOO-cppflags and --with-FOO-ldflags are new options, the new macro will have to differ between master and the release branches for a while. As far as I am concerned, I would be fine testing for DIR/foo.h only in the release branches, and master not testing it (since --with-FOO-cppflags can be used to achieve the same goal).

@rhc54 is this acceptable for you ?

# so this sucks, but there's no way to get through the progression
# of search libs without killing off the cache variable and trying
# again...
unset opal_Lib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, we were going to get rid of all this...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am puzzled. do we want to check DIR/lib64 when there is stuff in DIR/lib but AC_CHECK_LIB fails ? If yes, I am afraid we do need something like that, though we should update opal_Lib (or whatever its new name will be) since we moved from AC_SEARCH_LIBS to AC_CHECK_LIB. Makes sense ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggouaillardet I believe what @bwbarrett is saying is that we agreed not to run AC_CHECK_LIB at all if they provided a location, either in the ldflags option or in a non-default setting of the "with" option. Reason is that you'll get a false positive if the lib in the std location passes the test, even if the lib in the specified location fails. So the best you can do is check for existence in lib and lib64, and if found, then set LDFLAGS and move on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what @rhc54 said.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhc54 @bwbarrett let me put it this way.

If an end user configure --with-FOO=DIR and in the worst case scenario, you'd rather have configure success (and make might either crash because DIR/lib/libfoo.so is missing the required symbol, or silently use the libraries in the default location because DIR/lib/libfoo.so has the wrong bitness or the linker only accepts static library that is not there) than, silently use a working library in the default location instead or DIR/lib[64].

I fully understand there is not a perfect solution here.

If this is what we want, they I will make the changes, even if I believe this would be a regression (e.g. make fails and we could have trapped that in configure but we chose not to try).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you missed the point. Your configure logic is going to succeed anyway because AC_CHECK_LIB is going to do exactly what you just described - it is a compile test, and so the compiler is going to walk thru all those options and find a working library somewhere, but not necessarily the one you specified. Setting LDFLAGS does not limit the compiler to looking at just that directory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AC_CHECK_LIB is a link test. If --with-foo=DIR (or --with-foo-ldflags) found a library, that's the library it found. That library has the function that's important, in a way that's linkable (has the right size, bitness, static vs. dynamic, etc.). There's still a versioning problem and if callers are sensitive to a particular version, then they probably need extra tests in the higher level check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhc54 I fully understand AC_CHECK_LIB might lead to false positives if libfoo.so in DIR/lib is not usable. (And if it is usable, then this is the one that will be used, not the one in /usr/lib64).

13 days ago, you wrote

I thought we were going to revise the lib section to avoid use of AC_SEARCH_LIB and use AC_CHECK_LIB instead, and only after checking for existence of the specified lib in the target location?

and yesterday

I believe what @bwbarrett is saying is that we agreed not to run AC_CHECK_LIB at all if they provided a location, either in the ldflags option or in a non-default setting of the "with" option.

The only point in my previous comment was if we simply do not run AC_CHECK_LIB at all, then we not only keep the same false positives (and there is virtually nothing we can do about that), but we also miss some false negatives (e.g. configure successes, but make will fail).

And once again, if both of you (@rhc54 and @bwbarrett ) still agree Open MPI should not run AC_CHECK_LIB at all (when --with-FOO-ldflags=LDFLAGS or --with-FOO=DIR is used), then I will implement this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have given up - my head has been spun around on this so many times, I've lost count. You guys do whatever you want.

@ggouaillardet
Copy link
Contributor Author

@rhc54 do we really need to remove the trailing slash(es) ?

Developers should not add these in Open MPI, and so far, the policy was that end users get what they ask for. So if they configure --with-FOO=DIR////, they should/could/will get what they ask for.

@rhc54
Copy link
Contributor

rhc54 commented Mar 7, 2018

Yeah, we really do need to remove those trailing slash(es). Even some of our testers (e.g., IBM Jenkins in the PMIx repo) use them. So please do restore it.

@AboorvaDevarajan
Copy link
Member

Can one of the admins verify this patch?

@ibm-ompi
Copy link

ibm-ompi commented Mar 9, 2022

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/6ff06c812a7c60275cda1d3030ec6ad3

@bwbarrett
Copy link
Member

I'm not sure why the IBM tester ran on this PR yesterday. I'm going to close this PR, because we're in the process of rewriting opal_check_package to use pkg-config and friends and this will become less and less relevant as we do that.

@bwbarrett bwbarrett closed this Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants