Correlation sparse array is very slow #7788
Comments
Previous discussion: https://groups.google.com/forum/?fromgroups=#!topic/julia-users/VacAk16jiN0
Column-wise functions such as mean(S, 1), cov, etc. are also very slow on sparse arrays.
We will fix all of these as soon as 0.3 is released. Thanks for reporting.
Correlations and cov are calculated wonderfully on dense matrices, all 8
On 2014-08-01 14:45, Viral B. Shah wrote:
When can this work?
I=int32((rand(10^7)*9999999).+1);
I suspect that just implementing the
Big thanks, that is a short way ;)
@lindahua can you help here?
I think one may just need to implement At_mul_B and friends.
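The suggestion above amounts to getting the correlation from the sparse product X'X instead of from a centered (and therefore dense) copy of X. A minimal sketch follows, written in current Julia syntax rather than the 0.3/0.4 syntax used in this thread. The helper names `sparsecov` and `sparsecor` are hypothetical, not functions from Base, and this is not the implementation that was later merged. Note that the p×p result is dense in any case (for 30,000 columns of Float64 that is about 7.2 GB), and that the subtraction can lose precision when the column means are large relative to the spread.

```julia
using SparseArrays, LinearAlgebra

# Hypothetical helper: covariance of the columns of a sparse matrix X,
# computed from the sparse Gram matrix X'X without centering X itself.
function sparsecov(X::SparseMatrixCSC)
    n = size(X, 1)
    mu = sum(X, dims=1) ./ n          # 1×p dense row of column means
    G = Matrix(X' * X)                # p×p Gram matrix, product done sparsely
    return (G .- n .* (mu' * mu)) ./ (n - 1)
end

# Hypothetical helper: correlation obtained by rescaling the covariance.
# Columns with zero variance produce NaN entries.
function sparsecor(X::SparseMatrixCSC)
    C = sparsecov(X)
    s = sqrt.(diag(C))                # column standard deviations
    return C ./ (s * s')
end

X = sprandn(1000, 20, 0.01)
R = sparsecor(X)
```

On a small matrix the result can be compared against `cor(Matrix(X))` from the Statistics standard library.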
It is also very slow when we do sparse mean etc.
Reduction along dimensions has not been specially optimized for sparse matrices. That should not be too difficult, though, since we only have to consider matrices here rather than arrays of arbitrary dimensions.
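To illustrate why the matrix-only case is simple: in the compressed sparse column (CSC) layout, the stored entries of each column are contiguous, so a column reduction only needs one pass over the stored values. The sketch below uses current Julia syntax and the SparseArrays standard library; `colmeans` is a hypothetical name, and this is not necessarily how the eventual fix was written.

```julia
using SparseArrays

# Hypothetical helper: column means of a CSC sparse matrix,
# visiting only the stored entries.
function colmeans(A::SparseMatrixCSC)
    m, n = size(A)
    vals = nonzeros(A)                 # stored values, column by column
    out = zeros(float(eltype(A)), 1, n)
    for j in 1:n
        s = zero(eltype(out))
        for k in nzrange(A, j)         # positions in `vals` belonging to column j
            s += vals[k]
        end
        out[1, j] = s / m              # implicit zeros add nothing to the sum
    end
    return out
end

A = sparse([1, 3, 2], [1, 1, 3], [2.0, 4.0, 6.0], 3, 3)
colmeans(A)   # 1×3 row: [2.0 0.0 2.0]
```

The work is proportional to the number of stored entries plus the number of columns, not to the full size of the matrix.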
What about sparse statistics?
julia> k,l=size(D)
julia> E=zeros(l)';
julia> @time for i=1:l
It is probably similar for cor, var, cov, etc.
Paul
Please let these functions run in parallel.
@paulanalyst As usual: what is your
julia> A = sprandn(6000000,30000,0.00003);
julia> @time mean(A, 1);
elapsed time: 0.13834035 seconds (166 MB allocated, 9.94% gc time in 7 pauses with 0 full sweep)
julia> versioninfo()
julia> nnz(D)
I can confirm that on 0.3.5 @andreasnoack's example takes ages.
Yes. A faster mean for sparse matrices was introduced in bde4e65, so it is not available on 0.3.x. However, the implementation is very simple, i.e.
OK, on 0.4.0 mean is fast, as shown below. But for var I have been waiting many minutes...
Version 0.4.0-dev+2438 (2015-01-03 12:36 UTC)
julia> @time E=mean(D, 1)
julia> @time E=var(D, 1)
sum(D,1)/size(D,1) on 0.3.5 takes about twice as long as mean(D,1) on 0.4.0: 11 versus 6 seconds on my array.
julia> @time sum(D,1)/size(D,1)
I think that the reason The
We should backport the mean.
@andreasnoack I use X'X-E(X)'*E(X)
Then you won't get the right result. Anyway, I'm talking about the method in the issue where you thought there was a rounding error, but it was the wrong formula.
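For reference, one likely reading of why X'X-E(X)'*E(X) does not give the covariance: it leaves out the factor n on the mean term and the normalization. Assuming X is n×p with observations in rows, and writing μ for the 1×p row vector of column means (notation introduced here, not taken from the thread), the standard identity is:

```latex
\mu = \tfrac{1}{n}\,\mathbf{1}^{\top} X, \qquad
(X - \mathbf{1}\mu)^{\top}(X - \mathbf{1}\mu)
  = X^{\top}X - n\,\mu^{\top}\mu,
\qquad
\operatorname{cov}(X) = \frac{X^{\top}X - n\,\mu^{\top}\mu}{n-1},
\qquad
\operatorname{cor}(X)_{ij} = \frac{\operatorname{cov}(X)_{ij}}
  {\sqrt{\operatorname{cov}(X)_{ii}\,\operatorname{cov}(X)_{jj}}}.
```

The middle identity follows from X^T 1 = n μ^T, so the two cross terms and the last term combine into a single -n μ^T μ. Only X^T X involves the sparse matrix, which is why a fast sparse product is enough to make cov and cor fast.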
(cherry picked from commit bde4e65)
#10536 should have fixed this one.
Unfortunately it doesn't look like it does, since
Computing the correlation of a sparse array is very slow. Converting to a dense array runs out of memory when we have 30,000 columns. How can it be calculated quickly?
Paul