Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable and fix unit tests in different archs #183

Merged
merged 3 commits into from
May 9, 2020

Conversation

jlblancoc
Copy link
Member

@jlblancoc jlblancoc commented Dec 7, 2019

This PR does:


This change is Reviewable

# Check for optional compiler flags, if supported:
include(CheckCXXCompilerFlag)
# See https://github.com/borglab/gtsam/issues/21
check_cxx_compiler_flag(-faligned-new COMPILER_HAS_FALIGNED_NEW)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the minimum compiler versions of gcc, clang, and VS which support this? If it's only very new compilers, I think we should still prefer to address Eigen alignment issues by appropriate placement of EIGEN_MAKE_ALIGNED_OPERATOR_NEW so we don't abandon users who have no control over the compiler version in their build environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Please note GTSAM is used by many companies on a variety of platforms, including embedded ones where compiler versions might be lagging. Incidentally, This is also why C++17 and std:shared_ptr are not suited for 4.X versions...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I have removed this commit from the PR, since it was finally solved in a better way in #185

@dellaert
Copy link
Member

dellaert commented Dec 7, 2019

Many thanks BTW! I’ll wait to approve before the discussion above is resolved, but this allows GTSAM to build on more platforms, so awesome!

@jlblancoc jlblancoc force-pushed the fix/unit-test-tolerances branch 2 times, most recently from bf80e22 to c5c031b Compare December 13, 2019 07:07
@ProfFan ProfFan mentioned this pull request Dec 22, 2019
Copy link
Member

@dellaert dellaert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is ready for merging @jlblancoc @chrisbeall ??

@ProfFan
Copy link
Collaborator

ProfFan commented Dec 24, 2019

@dellaert I am not that comfortable with lowering tolerances that much... As in #168, I think the best way is to diagnose the problem in the numerical solutions, especially those implemented with a * b, where a can be very small so the result goes to zero...

@ProfFan
Copy link
Collaborator

ProfFan commented Dec 24, 2019

For example, in the SO4 PoC here: https://godbolt.org/z/vdwsbj

Even we define FLOAT_TYPE to float,

==== TESTING a!=0 && b == 0 ====
a = 0.0000001192,   c2 = 0.0000000000,   c3 = 0.0000000000
    0.0000001192, c2_s = 0.5000000000, c3_s = 0.1666666716
a = 0.0000011921,   c2 = 0.0000000000,   c3 = 0.0000000000
    0.0000011921, c2_s = 0.4999999404, c3_s = 0.1666666716
a = 0.0000119209,   c2 = 0.0000000000,   c3 = 0.0000000000
    0.0000119209, c2_s = 0.4999994934, c3_s = 0.1666666716
a = 0.0001192093,   c2 = 0.0000000000,   c3 = 0.0000000000
    0.0001192093, c2_s = 0.4999950230, c3_s = 0.1666666716
a = 0.0011920929,   c2 = 0.5033164620,   c3 = 0.1374389529
    0.0011920929, c2_s = 0.4999503195, c3_s = 0.1666666567
a = 0.0119209290,   c2 = 0.4999610484,   c3 = 0.1665760130
    0.0119209290, c2_s = 0.4995032847, c3_s = 0.1666654795
a = 0.1192092896,   c2 = 0.4994073808,   c3 = 0.1665496230
    0.1192092896, c2_s = 0.4950332344, c3_s = 0.1665482819
a = 1.1920928955,   c2 = 0.4435229003,   c3 = 0.1552171558
    1.1920928955, c2_s = 0.4530631304, c3_s = 0.1552170664

The result with Taylor series is still accurate to 10^(-7)

@ProfFan ProfFan closed this Dec 24, 2019
@ProfFan ProfFan reopened this Dec 24, 2019
@ProfFan
Copy link
Collaborator

ProfFan commented Dec 24, 2019

(misclicked on close...)

@jlblancoc
Copy link
Member Author

Any advise on how to deal with this PR? I don't actually have much more time to address a more detailed analysis on optimal unit test thresholds... ;-)
I can't remember why I changed most 1e-9 to 1e-5 but it was probably because 1e-8 or 1e-7 was still not coarse enough for some of the architectures to pass the tests.

@ProfFan
Copy link
Collaborator

ProfFan commented Jan 20, 2020

@jlblancoc Have you looked at the PoC I posted? I think some routines in GTSAM are actually not numerically optimal, so we can definitely first test on 32-bit architectures like i386 and arm and see what unit tests fail, and trace the code path to improve the algorithms. It is on my list for quite a while but I haven't got time...

@varunagrawal varunagrawal added bugfix Fixes an issue or bug enhancement Improvement to GTSAM wip This is still a work in progress labels Feb 23, 2020
@dellaert
Copy link
Member

dellaert commented Apr 4, 2020

@ProfFan if your last comment is a "nice to have" then add an issue and merge this PR? If it is a blocker please refresh us on what the way to proceed is.

@ProfFan ProfFan mentioned this pull request Apr 5, 2020
@ProfFan
Copy link
Collaborator

ProfFan commented Apr 27, 2020

@jlblancoc My apologies for leaving this for a long time! My two cents on this PR are:

  1. Merge this PR so we don't block the packaging
  2. Add a macro for the precision (#define GTSAM_UNITTEST_TOLERANCE 1e-5) in the future

@jlblancoc jlblancoc force-pushed the fix/unit-test-tolerances branch from c5c031b to c1513d0 Compare April 28, 2020 06:53
@jlblancoc
Copy link
Member Author

Hi,
On a second thought, I found most unit tests used 1e-9 or 1e-5 as thresholds. So, I just added them as two CMake-configurable preprocessor variables: GTSAM_UNITTEST_TOLERANCE_TIGHT and GTSAM_UNITTEST_TOLERANCE_LOOSE.

Will wait until Travis goes over this last commit before merging.

@jlblancoc jlblancoc force-pushed the fix/unit-test-tolerances branch from c1513d0 to 80ccc60 Compare April 28, 2020 07:08
@jlblancoc jlblancoc changed the title [WIP] Enable and fix unit tests in different archs Enable and fix unit tests in different archs Apr 28, 2020
@jlblancoc jlblancoc removed the wip This is still a work in progress label Apr 28, 2020
@ProfFan ProfFan requested a review from dellaert April 28, 2020 13:50
Copy link
Member

@dellaert dellaert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hate to be a pain in the butt, but I don't understand what the use is of those two defines. They make the code uglier (IMHO) and I'm not getting under what circumstances anyone would change them. I don't see any logic to set them differently for different platforms, but maybe I'm missing it? I'm just afraid we're adding complexity that will never be used by anyone - but I'm happy to be convinced otherwise.

@ProfFan
Copy link
Collaborator

ProfFan commented Apr 28, 2020

@dellaert On second thoughts, I think the core cause of this is that some of the exotic architectures supported by Debian lacks proper float operations? Because I think double is an IEEE standard and should always be 64 bit wide...

@jlblancoc
Copy link
Member Author

Hate to be a pain in the butt, but I don't understand what the use is of those two defines. They make the code uglier (IMHO) and I'm not getting under what circumstances anyone would change them. I don't see any logic to set them differently for different platforms, but maybe I'm missing it? I'm just afraid we're adding complexity that will never be used by anyone - but I'm happy to be convinced otherwise.

Hmm... it's totally true that the macros add one "level of indirection" to the reader.

They might have a potential utility, though, although the gain may be marginal so I don't have a strong point defending them: this PR started when I detected tests that passed in x64, but failed in other archs like x32 (see #168 and 54b739f ). In those cases, it would have been faster to change the value in one place instead of changing it in many test points.
OTOH... this is something that we won't be doing every day :-)

So, that was my motivation, but let me know what's your preference. It's 99% a style thing, not a big deal.

@dellaert
Copy link
Member

Hate to be a pain in the butt, but I don't understand what the use is of those two defines. They make the code uglier (IMHO) and I'm not getting under what circumstances anyone would change them. I don't see any logic to set them differently for different platforms, but maybe I'm missing it? I'm just afraid we're adding complexity that will never be used by anyone - but I'm happy to be convinced otherwise.

Hmm... it's totally true that the macros add one "level of indirection" to the reader.

They might have a potential utility, though, although the gain may be marginal so I don't have a strong point defending them: this PR started when I detected tests that passed in x64, but failed in other archs like x32 (see #168 and 54b739f ). In those cases, it would have been faster to change the value in one place instead of changing it in many test points.
OTOH... this is something that we won't be doing every day :-)

So, that was my motivation, but let me know what's your preference. It's 99% a style thing, not a big deal.

I'd prefer we did not do it. I'd even go the opposite way: wherever there isa 1e-9, we remove it, as it is the default value for the parameter. Then, if there is a different value, we suspect there was a reason for it.

@jlblancoc jlblancoc force-pushed the fix/unit-test-tolerances branch from 202c754 to 2622acb Compare April 29, 2020 06:09
@jlblancoc jlblancoc requested a review from dellaert April 29, 2020 06:17
@jlblancoc
Copy link
Member Author

@dellaert : rolled back to good old plain numbers ;-)

Approve and merge at your convenience.

# dh_make generated override targets
# This is example for Cmake (See https://bugs.debian.org/641051 )
override_dh_auto_configure:
dh_auto_configure -- -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr -DGTSAM_BUILD_EXAMPLES_ALWAYS=OFF -DGTSAM_BUILD_TESTS=OFF -DGTSAM_BUILD_WRAP=OFF -DGTSAM_BUILD_DOCS=OFF -DGTSAM_INSTALL_CPPUNITLITE=OFF -DGTSAM_INSTALL_GEOGRAPHICLIB=OFF -DGTSAM_BUILD_TYPE_POSTFIXES=OFF -DGTSAM_BUILD_WITH_MARCH_NATIVE=OFF
dh_auto_configure -- -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr -DGTSAM_BUILD_EXAMPLES_ALWAYS=OFF -DGTSAM_BUILD_TESTS=ON -DGTSAM_BUILD_WRAP=OFF -DGTSAM_BUILD_DOCS=OFF -DGTSAM_INSTALL_CPPUNITLITE=OFF -DGTSAM_INSTALL_GEOGRAPHICLIB=OFF -DGTSAM_BUILD_TYPE_POSTFIXES=OFF -DGTSAM_BUILD_WITH_MARCH_NATIVE=OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

behavior change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one of the main reasons of this PR: to run unit tests on the PPA server to have extra feedback on other architectures, additional to Travis ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: We didn't enable tests from the beginning due to some errors, if I recall correctly, but this should be the preferred state

@dellaert dellaert merged commit f948fe3 into develop May 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixes an issue or bug enhancement Improvement to GTSAM
Projects
None yet
5 participants