-
-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimization with 'fast math' produces incorrect results and segfaults #1155
Comments
Based on my understanding, yes. But as IEEE 754 is not enforced (which is where performance gains come from), this may make for some 'fun' refactoring. I mainly created this issue to make sure that users are aware of the limitation. It's somewhat puzzling that |
I'm not asking you to do this, but if anyone decides to pursue this issue, it'd be nice to have concrete benchmarks that adding these flags does improve real-world performance that justifies the (presumably) increased code complexity to handle this case. |
no worries. I’ll pass on this one 😜 |
I investigated this a bit, using Clang++ on Ubuntu 20.04. The only flag that's part of
Where the first two at least are just an issue of Enabling
Given the unsatisfactory nature of the changes required to support using |
Thanks for looking into this further, @speth! The two non-python tests look familiar, and I agree that As an aside, one thing I noticed in my own tests was a plethora of warnings coming out of Overall, I agree that it probably doesn't make sense to make this a priority. At the same time, it probably also makes sense to avoid using |
I don't see any warnings related to I would argue that most of the ways that we use NaN in Cantera are very reasonable, and that the alternatives are often worse. For instance, in the modification I made to the |
That sounds plausible.
I tend to agree: using |
@speth @bryanwweber @ischoegl Although the issue has been closed, I am reporting some additional data: I ran a small test program benchmarking the computation of reaction rates with 16 different compilers/versions, each with and without |
@g3bk47 … thanks for those results! Your comparison suite is impressive. From my perspective, a performance gain of up to 15% would be worth pursuing. While segfaults and erroneous results are fixed here (I.e. those issues are addressed), I believe this warrants an enhancement request? |
Problem description
As reported in #1150, compilation with default optimization options for the Intel compiler suite
icx
/icpx
results in incorrect results. The behavior can be reproduced forgcc
with the-ffinite-math-only
option (which is one of the-ffast-math
flags), e.g. for a vanillagcc
toolchain on Ubuntu 20.04:The option breaks strict IEEE compliance, as it does "Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs." There are some instances where Cantera uses
NaN
internally which presumably breaks 'fast math' optimizations.Steps to reproduce
gcc
with fast math (same as foricx
in Optimized Cantera build with the LLVM-based Intel compilers do not work correctly #1150) ...Behavior
System information
gcc
/icx
Additional context
{fmt}
output, which currently produces numerous warnings related toNaN
compliance.NaN
can be implemented, see SOThe text was updated successfully, but these errors were encountered: