Implement NVVMReflect in Julia. #280
Conversation
Weirdly, there's no change at the PTX level. Could it be that the pass was being run by the back-end after all? It's probably good to run it earlier regardless, giving LLVM more optimization opportunities.
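A quick way to check this sort of thing is to dump the optimized module and grep for leftover reflect calls. A minimal sketch, assuming a CUDA.jl-style `code_llvm`; the kernel, array types, and setup below are illustrative, not from the PR:

```julia
using CUDA

# Illustrative kernel hitting a libdevice routine that goes through
# __nvvm_reflect (reciprocal square root is a typical example).
kern(x) = (@inbounds x[1] = 1f0 / sqrt(x[1]); nothing)

# Dump the whole optimized module and look for surviving reflect calls.
ir = sprint() do io
    CUDA.code_llvm(io, kern, Tuple{CuDeviceVector{Float32,1}};
                   dump_module=true, raw=true)
end
occursin("__nvvm_reflect", ir) && @warn "reflect calls survived optimization"
```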
Force-pushed from 448f9eb to f4311bf.
Was trying this out.
LLVM doesn't handle it either, but is more conservative about failing: https://github.com/JuliaLang/llvm-project/blob/bc5644ee74f4cb42042257ac129d2be1c252e3f2/llvm/lib/Target/NVPTX/NVVMReflect.cpp#L163-L173
Force-pushed from f4311bf to 058b615.
OK, I added a couple more flags and demoted the error to a warning. Curiously, LLVM currently defaults to 0 for an unsupported flag. That means it is using --prec-div=false and --prec-sqrt=false, both of which NVCC puts under the --use_fast_math=true flag. So we were effectively running in a fast-math-like mode by default...
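To make those semantics concrete, here is a sketch of the lookup being described: known flags derive their value from Julia's fast-math setting, and unknown ones warn and fall back to 0 the way LLVM does. The function below is illustrative, not the actual pass code:

```julia
# Sketch: reflect values driven by julia --math-mode=fast, with an
# LLVM-style default of 0 for anything unrecognized.
function reflect_value(flag::String, fast_math::Bool)
    if flag == "__CUDA_FTZ"
        # flushing denormals to zero is a fast-math behavior
        return fast_math ? 1 : 0
    elseif flag == "__CUDA_PREC_DIV" || flag == "__CUDA_PREC_SQRT"
        # precise division and sqrt are disabled under fast math
        # (NVCC groups --prec-div=false and --prec-sqrt=false under
        # --use_fast_math=true)
        return fast_math ? 0 : 1
    else
        @warn "Unsupported __nvvm_reflect flag, defaulting to 0" flag
        return 0
    end
end

reflect_value("__CUDA_PREC_DIV", Base.JLOptions().fast_math == 1)
```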
Also had to dig this out recently: https://llvm.org/docs/CompileCudaWithLLVM.html#flags-that-control-numerical-code. One of the biggest divergences from Clang is in our choice of not using …
```julia
# handle possible cases
# XXX: put some of these properties in the compiler job?
# and/or first set the "nvvm-reflect-*" module flags like Clang does?
fast_math = Base.JLOptions().fast_math == 1
```
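For reference on the module-flag question in the XXX above: Clang communicates the FTZ choice to LLVM's own NVVMReflect pass through the "nvvm-reflect-ftz" module flag, where behavior 4 means Override. A sketch of the textual IR involved, embedded in a Julia string purely for illustration:

```julia
# Module flag Clang emits so the back-end's NVVMReflect pass resolves
# __nvvm_reflect("__CUDA_FTZ") to 1; the `i32 4` is the Override behavior.
const NVVM_REFLECT_FTZ = """
    !llvm.module.flags = !{!0}
    !0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}
    """
```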
Having this be part of `@cuda` would be cool, but this is already a strict improvement!
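Purely as a sketch of what that could look like (the `fastmath` keyword here is hypothetical, not something this PR adds):

```julia
# Hypothetical per-kernel override instead of the global --math-mode=fast:
@cuda threads=256 fastmath=true kernel(a, b)
```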
Force-pushed from 058b615 to e86fe9d.
This should be good to go. I don't want to put too much effort into this pass, because I'm still hoping we can just use the LLVM back-end for this in the future (in that case we'll have to set the appropriate module flags so that LLVM can act in accordance with Julia's fast-math settings).
Codecov Report
```
@@            Coverage Diff             @@
##           master     #280      +/-   ##
==========================================
- Coverage   86.75%   86.12%   -0.63%
==========================================
  Files          22       22
  Lines        2016     2062      +46
==========================================
+ Hits         1749     1776      +27
- Misses        267      286      +19
==========================================
```
Continue to review full report at Codecov.
Before:
After:
So 250 -> 120 lines of PTX, a pretty significant reduction.
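For anyone wanting to reproduce the comparison, one way to count PTX lines follows, assuming a CUDA.jl-style `code_ptx` and reusing the illustrative `kern` from the earlier sketch:

```julia
# Render the PTX to a string and count its lines.
ptx = sprint() do io
    CUDA.code_ptx(io, kern, Tuple{CuDeviceVector{Float32,1}})
end
println(count(==('\n'), ptx), " lines of PTX")
```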