Efficiency improvement of exp(::StridedMatrix) with UniformScaling and mul! #40668
Conversation
Could you try a larger matrix size for (b)? At the size you tested (n=50) it looks like there are no improvements.
For case b, the relative difference becomes smaller and smaller the bigger the matrix, because the computation time is dominated by the matrix-matrix CPU time. You can see slightly more if you "eliminate" the matrix-matrix products:

using LinearAlgebra

function exp_test_new(A,A2,A4,A6)
    T = eltype(A)
    CC = T[64764752532480000., 32382376266240000., 7771770303897600.,
           1187353796428800., 129060195264000., 10559470521600.,
           670442572800., 33522128640., 1323241920.,
           40840800., 960960., 16380.,
           182., 1.]
    # New approach: add the constant terms directly on the diagonal
    # instead of allocating an identity matrix Inn.
    Ut = CC[4]*A2
    Ut[diagind(Ut)] .+= CC[2]
    U = (CC[14].*A6 .+ CC[12].*A4 .+ CC[10].*A2) .+
        CC[8].*A6 .+ CC[6].*A4 .+ Ut
    Vt = CC[3]*A2
    Vt[diagind(Vt)] .+= CC[1]
    V = (CC[13].*A6 .+ CC[11].*A4 .+ CC[9].*A2) .+
        CC[7].*A6 .+ CC[5].*A4 .+ Vt
end
function exp_test_org(A,A2,A4,A6)
    T = eltype(A)
    CC = T[64764752532480000., 32382376266240000., 7771770303897600.,
           1187353796428800., 129060195264000., 10559470521600.,
           670442572800., 33522128640., 1323241920.,
           40840800., 960960., 16380.,
           182., 1.]
    n = size(A,1)
    Inn = Matrix{eltype(A)}(I,n,n)
    U = (CC[14].*A6 .+ CC[12].*A4 .+ CC[10].*A2) .+
        CC[8].*A6 .+ CC[6].*A4 .+ CC[4].*A2 .+ CC[2].*Inn
    V = (CC[13].*A6 .+ CC[11].*A4 .+ CC[9].*A2) .+
        CC[7].*A6 .+ CC[5].*A4 .+ CC[3].*A2 .+ CC[1].*Inn
end

We can do timing like this:
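(The original timing snippet was not captured here; a minimal sketch of how one might compare the two variants with BenchmarkTools, where the size n = 500 is only an illustrative choice and not the number used in the original post:)

```julia
using LinearAlgebra, BenchmarkTools

n  = 500            # illustrative size, not from the original post
A  = randn(n, n)
A2 = A*A
A4 = A2*A2
A6 = A2*A4

@btime exp_test_org($A, $A2, $A4, $A6);
@btime exp_test_new($A, $A2, $A4, $A6);
```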
New code seems better for this domain of [...] Edit: After incorporating [...]
I think one can save one additional allocation in case b by replacing [...] and then precompute [...] where [...]
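(The inline snippets in that suggestion were not captured, so the exact replacement is unknown; purely as a generic illustration of the kind of allocation-saving, in-place update being discussed, with hypothetical names not taken from the reviewer's actual suggestion, the five-argument mul! folds a multiply-and-accumulate into a preallocated buffer:)

```julia
using LinearAlgebra

A, B = randn(4, 4), randn(4, 4)
C = randn(4, 4)            # preallocated buffer
mul!(C, A, B, 2.0, 1.0)    # in place: C = 2.0*A*B + 1.0*C, no temporary allocated
```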
BTW, you might get rid of the massive CI failure by rebasing onto current master. There is some SuiteSparse checksum stuff going on. I was seeing all those failures and immediately started searching for typos. 😄
LGTM, a nice improvement. Shall we merge?
Should I squash the commits? I would be happy to do it except my usual procedure with [...]
I would have squashed it when merging, so there is no need to do that. As for the many other commits, I do the following procedure: first pull the current master into your local master, then checkout your branch, then [...]
That will hopefully make all the intermediate commits go away.
Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
Thanks. 👍 It seems the addition of the other commits spammed other PRs. Apologies. 🙈
…d mul! (JuliaLang#40668) Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
Avoids the use of Inn = Matrix{T}(I,n,n). This reduces the number of
a) matrix-matrix products for nA <= 2.1 (commit d94b513)
b) zero additions for nA > 2.1 (commit e6860a4)
This PR also has some memory allocation improvements by using mul!.
Two CPU-time illustrations, original vs new:
a) nA < 2.1
b) nA > 2.1
The most important improvement is case a) where the number of matrix-matrix products is reduced.
Solves JuliaLang/LinearAlgebra.jl#840.
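A minimal sketch of the core trick, with illustrative names and sizes rather than the exact PR code: instead of materializing Inn just to add a multiple of the identity, the constant is added directly on the diagonal in place.

```julia
using LinearAlgebra

n, c = 4, 3.0
A2 = randn(n, n)

# Old pattern: allocates an n×n identity matrix just to add c on the diagonal.
U_old = A2 .+ c .* Matrix{Float64}(I, n, n)

# New pattern: add c directly to the diagonal entries; no identity matrix needed.
U_new = copy(A2)
U_new[diagind(U_new)] .+= c

U_old ≈ U_new   # true
```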