More i686 testing issues #8812
Comments
The reason I'm opening this issue is because I'd like to see if there's something we can do to fix floating point behavior systematically. The floating point inaccuracy is causing all sorts of problems:
I'm not sure if I should make the equivalence check here an approximate equivalence check, or whether we really can get good floating point precision when compiling against |
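If the check does end up being approximate, one common shape for it is a relative-tolerance comparison. The sketch below is only an illustration in C; the function name `approx_equal` and the choice of a purely relative tolerance are assumptions, not anything taken from the test suite discussed here.

```c
#include <float.h>

/* Hypothetical helper: returns 1 when a and b agree to within a relative
 * tolerance rel_tol, 0 otherwise. NaN never compares equal, and an
 * infinity only matches an identical infinity. */
int approx_equal(double a, double b, double rel_tol)
{
    if (a == b)
        return 1; /* exact match, including equal infinities */

    /* Reject NaN and infinities: both range tests are false for them. */
    if (!(a >= -DBL_MAX && a <= DBL_MAX) || !(b >= -DBL_MAX && b <= DBL_MAX))
        return 0;

    double abs_a = a < 0.0 ? -a : a;
    double abs_b = b < 0.0 ? -b : b;
    double scale = abs_a > abs_b ? abs_a : abs_b;
    double diff  = a > b ? a - b : b - a;

    return diff <= rel_tol * scale;
}
```

A tolerance of a few multiples of `DBL_EPSILON` accepts results that differ only in the last bit or two, which is the kind of discrepancy excess precision tends to produce.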
Someone who knows what defaults LLVM uses should chime in here. Are any of these, especially |
The problem on non-SSE Architectures (such as i686) is not lack of The flag -ffloat-store prevents this excess precision. It also slows down |
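One way to see whether a given compiler and flag combination evaluates intermediates with excess precision is the standard C99 macro `FLT_EVAL_METHOD` from `<float.h>`. The sketch below is a minimal illustration; the function name is made up, and whether the reported value matches the code actually generated depends on the compiler.

```c
#include <float.h>

/* Hypothetical helper: reports whether this compilation may evaluate
 * float/double expressions in a wider format than their declared type.
 * FLT_EVAL_METHOD (C99):  0 = each operation is rounded to its own type,
 *                         1 = float is evaluated as double,
 *                         2 = everything is evaluated as long double
 *                             (typical for x87 code generation),
 *                  negative = indeterminate.
 * Returns 1 if excess precision is possible, 0 if not, -1 if unknown. */
int has_excess_precision(void)
{
#if defined(FLT_EVAL_METHOD)
    return FLT_EVAL_METHOD != 0;
#else
    return -1; /* pre-C99 headers: the macro is not available */
#endif
}
```

On a typical x86-64 build, where SSE2 is the default for scalar math, this is expected to return 0; a 32-bit x87 build would normally report excess precision.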
I guess even i686 uses legacy 80-bit x87 floating point math then? If we can't come up with a systematic solution then we do need to draw the line somewhere regarding how old of a processor we can realistically support. |
Yes, i686 may not have SSE2 instructions. I would simply add the -ffloat-store for this architecture, as this is the accepted way to obtain reproducible math results there. Bonus points for auto-detecting SSE2, and using -fsse2-math (sp?) instead in this case. |
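For reference, the GCC spelling of the SSE math option is `-mfpmath=sse`, used together with `-msse2`. The sketch below shows one possible way to detect SSE2, both at compile time (the `__SSE2__` predefined macro) and at run time (the `__builtin_cpu_supports` builtin available in GCC and Clang on x86). The function names are hypothetical and nothing here is taken from the project's build system.

```c
/* Hypothetical helper: run-time check of the CPU the program runs on.
 * Returns 1 if SSE2 is available, 0 if not, and -1 when the check is not
 * available (non-x86 target or a compiler without the builtin). */
int cpu_has_sse2(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init(); /* harmless if already initialised */
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return -1;
#endif
}

/* Hypothetical helper: compile-time check of what the compiler was
 * allowed to emit. Returns 1 if this translation unit was built with
 * SSE2 enabled (for example via -msse2), 0 otherwise. */
int compiled_with_sse2(void)
{
#if defined(__SSE2__)
    return 1;
#else
    return 0;
#endif
}
```

The two answers can differ: a binary built for plain i686 reports 0 from `compiled_with_sse2()` even on a CPU where `cpu_has_sse2()` returns 1, which is why a build-time decision has to key off the target rather than the build machine.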
Worth trying. How many different places will we need it? Just openlibm, or across all deps?
We don't currently have any compile-time processor feature detection, do we? Keying this off of |
-ffloat-store should be necessary everywhere that may perform floating-point operations. So probably everywhere. It may also be necessary when calling LLVM to generate code. |
@eschnett Unfortunately, adding |
@staticfloat You would also need to ensure that Julia uses the equivalent of this flag when generating code by calling LLVM. I don't think there is a flag for this -- this is rather a code generation issue. It may even be necessary to modify Julia's code generator to store and immediately re-load every floating-point value after performing a floating-point operation. |
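To make the store-and-reload idea concrete, here is a minimal C sketch of what it looks like for a single operation. It relies on `volatile` to force the round trip through memory; the function names are invented for illustration, and this is not how Julia's code generator works.

```c
/* Hypothetical helpers: perform one operation, then force the result
 * through a 64-bit memory slot so any excess x87 precision is discarded. */
double add_rounded(double a, double b)
{
    volatile double tmp = a + b; /* store: rounds to double */
    return tmp;                  /* reload the rounded value */
}

double mul_rounded(double a, double b)
{
    volatile double tmp = a * b;
    return tmp;
}
```

The result is first rounded to the x87 working precision and then rounded again on the store, so in rare cases it can still differ from a true double-precision operation. The round trip removes excess precision but does not guarantee bit-identical results.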
The most direct approach is usually to check out a source copy of clang and see what it translates the flag to |
Unfortunately, |
Apparently clang removed this option some time in the past two years. Yes, LimitFloatPrecision seems to be for something else. I investigated a bit, and am quite surprised at how difficult this is. Apparently, C99 mandates that this excess precision can be removed by rounding, so that e.g.
may have too much precision, while
will not. GCC agrees with this, Clang doesn't -- and I also don't know whether this is Clang or LLVM's optimizer. One way out I found is described here: http://stackoverflow.com/questions/17663780/is-there-a-document-describing-how-clang-handles-excess-floating-point-precision. One can set the i387 control word to round to a particular precision, e.g. double or float. Setting it once at startup would be fine; however, one then has to choose between single and double precision. Thus one would need to do this before every floating point operation. Setting the control word is probably expensive. Another option would be to store and re-load floating point values to memory after each operation. This is probably cheaper than changing the i387 control word. If LLVM doesn't support this out of the box, then it may be necessary to implement this in Julia's code generator. This should be straightforward, but tedious. This would essentially implement the effects of -ffloat-store explicitly. Taking a step back: The older AMD processors you mention, do they support SSE2 intrinsics? If so, it would be much easier to force LLVM to use SSE2 for math instead of i387. |
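The control-word approach mentioned above can be sketched in C with GCC-style inline assembly. This is an illustrative sketch only: the function names are hypothetical, and it assumes a GCC-compatible compiler targeting x86. The precision-control field occupies bits 8 and 9 of the x87 control word (binary 00 = single, 10 = double, 11 = extended).

```c
/* Hypothetical helper: set the x87 precision-control field to double
 * (53-bit significand). Returns 0 on success, -1 when the target has no
 * x87 unit or the compiler lacks GCC-style inline assembly. */
int x87_set_double_precision(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short cw;
    __asm__ __volatile__("fnstcw %0" : "=m"(cw));      /* read control word */
    cw = (unsigned short)((cw & ~0x0300u) | 0x0200u);  /* PC field = 10b */
    __asm__ __volatile__("fldcw %0" : : "m"(cw));      /* write it back */
    return 0;
#else
    return -1;
#endif
}

/* Hypothetical helper: read back the precision-control field.
 * Returns 0 (single), 2 (double), 3 (extended), or -1 if unavailable. */
int x87_get_precision_bits(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short cw;
    __asm__ __volatile__("fnstcw %0" : "=m"(cw));
    return (cw >> 8) & 0x3;
#else
    return -1;
#endif
}
```

The precision-control field only narrows the significand; the exponent range stays that of the extended format, so results near overflow or in the denormal range can still differ from true double arithmetic.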
No, the whole problem here is that the SSE2 instructions are not supported On Thu, Oct 30, 2014 at 7:05 PM, Erik Schnetter notifications@github.com
|
Agreed, it's not worth spending too much time on it. If there's a simple solution which requires little work, even at the cost of a very slow execution, then go for it, and ship a special i686 build with a big warning on start. People using old CPUs like that are not going to ask for speed anyway. |
I'd really like to get the MARCH=i686 builds working, but there is definitely something funky going on with the floating point arithmetic. Many things that should be "precise" in our floating-point math are only approximate once you compile for i686. I've got some fixes pushed to this branch (which may themselves be a little suspect), but in any case we should probably try to get these fixed, since we need to compile 32-bit binaries as i686 to allow for older AMD processors.