-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uninline trig functions. #24117
Uninline trig functions. #24117
Conversation
@nanosoldier |
Even if this is a slight performance decrease in microbenchmarks, I think we should do this. |
Don't disagree, just wanted to see the actual performance impact here. I'm not entirely sure how b=@benchmarkable sin($x)
tune!(b)
run(b) is supposed to be / not be affected by inlining at the |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Okay, so some of this is certainly noise, but I do notice that there are 10 |
Is nanosoldier back? What about JuliaIO/JLD.jl#196 ? |
I have absolutely no idea why this worked and other PRs didn't (and continue not to). |
@pkofod since you mentioned: f(x)=sin(x)+cos(x) In general I think your right about not inlining (called outlining?), but for that case I see in my 0.5 (not everyone will know about the alternative sincos function or how to call it..): [I don't know too much about this, but since this is an important function can't we do better for f?]
I see you dropped inline from sincos and then I also see (could the above be compiled using this or similar?):
I'm not sure how I can call these, but can Julia be clever (without macros) and recognize that function (and similar ones, for e.g. power of two) and call such one sincos function?
|
sincos doesn't return |
The
|
I don't think so, |
Sorry, I mistook how |
Yeah, that's why I'm surprised at the regressions, but I think there's a chance they might just be noise, though it is 10/66 benchmarks. |
While we're waiting for @nanosoldier I tried to run the sincos benchmarks locally. Seems like there's around 10% regression here on average. Some exceptions: Below is a figure. On the left, it shows histograms of ratios of runtimes where the numerator is the noinline commit (this one) and the denominator is the commit just before it. Blue and red are just two different |
While we should probably try to be fast for |
Do you know why the |
Possibly? In that case there might not be much we can do about it. |
If someone (cc: @jrevels @ararslan ) who knows more about the benchmarking code can verify (or teach me how to verify) if nanosoldier constant folds the |
I don't think this is constant folding. If the function returns immediately after checking if the input argument is zero, then the cost of having to do the call will be large compared to if the function body is inlined. So it is not surprising that preventing inlining of this function will give regression for the special cases. I think this is a non-issue though, and that this should just be merged. |
Yes of course, the work to call time ratio will be very very low in this case (<0.5 apparently). |
The resulting code for functions that calls
sin
orcos
is increased substantially when inlined (thinkf(x)=sin(x)+cos(x)
). I didn't do it fortan
,atan
,atan2
either, so might just have been the remains of some trial-and-error learning about performance sensitive code on my part.Would appreciate if someone ran nanosoldier here.