
More i686 testing issues #8812

Closed
staticfloat opened this issue Oct 25, 2014 · 15 comments
Labels
system:32-bit Affects only 32-bit systems

Comments

@staticfloat
Member

I'd really like to get the MARCH=i686 builds working, but there is definitely something funky going on with the floating point arithmetic. Many things that should be "precise" in our floating-point math are only approximate once you compile for i686. I've got some fixes pushed to this branch (which may themselves be a little suspect), but in any case we should probably try to get these fixed, since we need to compile 32-bit binaries as i686 to allow for older AMD processors.

staticfloat added the system:32-bit (Affects only 32-bit systems) label on Oct 25, 2014
@staticfloat
Member Author

The reason I'm opening this issue is because I'd like to see if there's something we can do to fix floating point behavior systematically. The floating point inaccuracy is causing all sorts of problems:

julia> sind(30)
0.49999999999999994

julia> rationalize(Int32, -2.7, tol=0)
-27//10

julia> rationalize(Int32, -2.7, tol=0) == -2.7
false

I'm not sure if I should make the equivalence check here an approximate equivalence check, or whether we really can get good floating point precision when compiling against i686.
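
For reference, a quick sketch of what an approximate check could look like, using Base's isapprox with its default tolerances (the result shown is just what I'd expect on a correctly-rounding build):

julia> isapprox(rationalize(Int32, -2.7, tol=0), -2.7)
true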

@ViralBShah
Member

Someone who knows what defaults LLVM uses should chime in here. Are any of these, especially fast-math, enabled?

http://llvm.org/docs/LangRef.html#fastmath
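
For reference, fast-math is expressed in the IR as per-instruction flags rather than a global default, e.g. (illustrative IR only):

%x = fadd fast double %a, %b   ; all fast-math flags set on this add
%y = fadd double %a, %b        ; plain IEEE semantics, no fast-math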

@eschnett
Contributor

The problem on non-SSE architectures (such as i686) is not a lack of precision, but excess precision. This is still inconvenient, since results differ depending on where "too much" precision is used.

The flag -ffloat-store prevents this excess precision. It also slows down execution a bit.
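
A small C sketch of the effect (illustrative only; the variable names are made up, and whether the difference actually shows up depends on optimization level and register allocation):

/* Built 32-bit for x87, e.g. `gcc -m32 -march=i686 -O2 excess.c`:
 * y*y overflows a 64-bit double but still fits in the 80-bit x87 register,
 * so the in-register result can compare unequal to the value forced
 * through memory. Adding -ffloat-store spills every assignment, so both
 * sides round to double and the comparison succeeds. */
#include <stdio.h>

volatile double y = 1e200;            /* volatile so the multiplies aren't constant-folded */

int main(void) {
    double a = y * y;                 /* may stay in an 80-bit register */
    volatile double spilled = y * y;  /* forced out to a 64-bit memory slot (rounds to inf) */
    printf("%d\n", a == spilled);     /* can print 0 on x87, 1 with -ffloat-store */
    return 0;
}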

@tkelman
Contributor

tkelman commented Oct 26, 2014

I guess even i686 uses legacy 80-bit x87 floating point math then? If we can't come up with a systematic solution then we do need to draw the line somewhere regarding how old of a processor we can realistically support. pentium4 with SSE allows us to not worry about the x87 floating point issues, but would rule out at least 1 real user.

@eschnett
Contributor

Yes, i686 may not have SSE2 instructions. I would simply add the -ffloat-store for this architecture, as this is the accepted way to obtain reproducible math results there. Bonus points for auto-detecting SSE2, and using -fsse2-math (sp?) instead in this case.

@tkelman
Contributor

tkelman commented Oct 27, 2014

I would simply add the -ffloat-store for this architecture, as this is the accepted way to obtain reproducible math results there.

Worth trying. How many different places will we need it? Just openlibm, or across all deps?

Bonus points for auto-detecting SSE2, and using -fsse2-math (sp?) instead in this case.

We don't currently have any compile-time processor feature detection, do we? Keying this off of MARCH might be simplest.
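
Something like this in the build files might be enough to try it (a sketch only; -ffloat-store is the real GCC flag, but treat the variable names as assumptions modeled on Make.inc):

ifeq ($(MARCH),i686)
# force every FP assignment back through a 64-bit memory slot on x87-only targets
JCFLAGS += -ffloat-store
JCXXFLAGS += -ffloat-store
endif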

@eschnett
Contributor

-ffloat-store should be necessary everywhere that may perform floating-point operations. So probably everywhere. It may also be necessary when calling LLVM to generate code.

@staticfloat
Member Author

@eschnett Unfortunately, adding -ffloat-store to the CFLAGS of LLVM, Julia, and openlibm doesn't work for me. I still get the same errors.

@eschnett
Contributor

@staticfloat You would also need to ensure that Julia uses the equivalent of this flag when generating code by calling LLVM. I don't think there is a flag for this -- it is rather a code-generation issue. It may even be necessary to modify Julia's code generator to store and immediately re-load every floating-point value after performing a floating-point operation.
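
Roughly, at the IR level that would mean turning every floating-point result into a spill/reload pair, something like (illustrative IR, not what the code generator emits today):

%slot = alloca double
%sum = fadd double %a, %b
store double %sum, double* %slot
%sum.rounded = load double* %slot   ; use %sum.rounded wherever %sum would have been used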

@staticfloat
Member Author

@vtjnash @Keno Do either of you know how I might go about doing this?

@vtjnash
Member

vtjnash commented Oct 30, 2014

The most direct approach is usually to check out a source copy of clang and see what it translates the flag to.

@staticfloat
Member Author

Unfortunately, clang doesn't support -ffloat-store. I found the option -mlimit-float-precision, but that looks like it maps to LimitFloatPrecision in LLVM-land, which I believe is meant for much lower-precision floating-point operations?

@eschnett
Contributor

Apparently clang removed this option some time in the past two years. Yes, LimitFloatPrecision seems to be for something else.

I investigated a bit, and am quite surprised at how difficult this is. Apparently, C99 mandates that an assignment rounds away this excess precision, so that e.g.

a = b+d+c;

may have too much precision, while

tmp = b+d;
a = tmp+c;

will not. GCC agrees with this, Clang doesn't -- and I also don't know whether this is Clang or LLVM's optimizer.

One way out I found is described here: http://stackoverflow.com/questions/17663780/is-there-a-document-describing-how-clang-handles-excess-floating-point-precision.

One can set the i387 control word to round to a particular precision, e.g. double or float. Setting it once at startup would be fine; however, one then has to choose between single and double precision, so to get correct rounding for both Float32 and Float64 one would need to switch it before every floating-point operation. Setting the control word is probably expensive.
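
For reference, doing that once at startup would look roughly like this with glibc's fpu_control.h (a sketch that picks 53-bit/double rounding, which is exactly the single-vs-double trade-off mentioned above; note the exponent range stays extended, so overflow and underflow can still double-round):

#include <fpu_control.h>

static void x87_round_to_double(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* clear the precision-control bits, set 53-bit */
    _FPU_SETCW(cw);
}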

Another option would be to store and re-load floating point values to memory after each operation. This is probably cheaper than changing the i387 control word. If LLVM doesn't support this out of the box, then it may be necessary to implement this in Julia's code generator. This should be straightforward, but tedious. This would essentially implement the effects of -ffloat-store explicitly.

Taking a step back: The older AMD processors you mention, do they support SSE2 intrinsics? If so, it would be much easier to force LLVM to use SSE2 for math instead of i387.
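
If they do, the two code paths are easy to compare with llc (standard llc options; the input file name here is hypothetical):

llc -march=x86 -mattr=+sse2 -o fp_sse2.s kernel.ll      # scalar FP lowered to SSE2
llc -march=x86 -mattr=-sse,-sse2 -o fp_x87.s kernel.ll  # no SSE at all, falls back to x87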

@staticfloat
Member Author

No, the whole problem here is that the SSE2 instructions are not supported on that architecture. Reimplementing -ffloat-store is way too much work. An LLVM option to set this would be acceptable, but I have no idea how to do that. We shouldn't lose sleep over this; it's more of a completeness and correctness thing than anything else. I doubt anyone truly cares about this, since all remotely modern hardware has SSE2. I'll leave this open a little longer, and then if we can't come up with a good solution, I'll just provide an i686 build on demand.
-E


@nalimilan
Member

Agreed, it's not worth spending too much time on it.

If there's a simple solution that requires little work, even at the cost of very slow execution, then go for it, and ship a special i686 build with a big warning at startup. People using old CPUs like that are not going to ask for speed anyway.
