
sprand and SparseMatrixCSC constructor simplifications #30617

Merged · 6 commits · Jan 22, 2019

Conversation

@abraunst (Contributor) commented Jan 6, 2019

This PR implements some simplifications to the sparse code, made easy by #30494. There, sparse random matrices were effectively constructed from a vector of linear indices (which, for sprand, come from randsubseq). So I factored the sparse-matrix construction out of sprand so that it can now be used both in sprand and in the fallback sparse constructor from AbstractMatrix (which does not catch e.g. Matrix, because there is a more specialized constructor from StridedMatrix). I checked against Transpose and found no performance regression; in fact there seems to be a small performance gain, because (a) there is no need to check for duplicates and (b) using linear indices avoids allocating an extra vector of indices. Is there a way to check for performance regressions more systematically?
(Net outcome of this PR: -60 lines.)

using Random, SparseArrays, LinearAlgebra, BenchmarkTools

for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)
    @show(n,p)
    Random.seed!(1); x=Transpose(Matrix(sprand(n,n,p))); @show @benchmark sparse($x)
end

MASTER:

| n \ p  | 1.0        | 0.1        | 0.01       | 0.001      | 0.0001     |
|--------|------------|------------|------------|------------|------------|
| n=10   | 1.481 μs   | 517.677 ns | 379.346 ns | 350.789 ns | 353.089 ns |
| n=100  | 121.878 μs | 37.005 μs  | 20.916 μs  | 18.598 μs  | 18.476 μs  |
| n=1000 | 20.618 ms  | 6.356 ms   | 3.295 ms   | 2.892 ms   | 2.936 ms   |

PR:

| n \ p  | 1.0        | 0.1        | 0.01       | 0.001      | 0.0001     |
|--------|------------|------------|------------|------------|------------|
| n=10   | 1.357 μs   | 445.061 ns | 332.215 ns | 305.885 ns | 307.341 ns |
| n=100  | 113.133 μs | 37.731 μs  | 20.636 μs  | 18.667 μs  | 18.342 μs  |
| n=1000 | 18.671 ms  | 6.013 ms   | 3.290 ms   | 2.924 ms   | 2.939 ms   |
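
For context, the factored-out construction can be sketched as follows. This is a minimal sketch, not the PR's actual implementation; it assumes I holds strictly increasing, 1-based, column-major linear indices:

using SparseArrays

# Build CSC structure directly from sorted linear indices. Because the
# indices are strictly increasing, no duplicate check or sorting pass is
# needed (point (a) above), and no separate (row, col) index vectors are
# allocated (point (b)).
function sparse_sortedlinearindices_sketch(I::Vector{Int}, V::Vector{Tv}, m::Int, n::Int) where Tv
    rowval = Vector{Int}(undef, length(I))
    colptr = zeros(Int, n + 1)
    colptr[1] = 1
    for (k, idx) in enumerate(I)
        col, row = divrem(idx - 1, m) .+ 1  # recover (col, row) from the linear index
        rowval[k] = row
        colptr[col + 1] += 1                # count entries per column
    end
    for c in 1:n
        colptr[c + 1] += colptr[c]          # turn per-column counts into column pointers
    end
    return SparseMatrixCSC(m, n, colptr, rowval, V)
end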

@ViralBShah (Member)

I don't believe that the nanosoldier perf benchmarks have anything relevant here.

@ViralBShah ViralBShah added the sparse Sparse arrays label Jan 6, 2019
@ViralBShah ViralBShah changed the title Sparse code simplifications sprand code simplifications Jan 7, 2019
@abraunst (Contributor, Author) commented Jan 7, 2019

@ViralBShah I am OK with the title change, but note that this is not exclusively about sprand: it introduces a constructor of SparseMatrixCSC from a sorted vector of linear indices and uses it in both sprand and sparse.
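
Schematically, both call sites then feed sorted linear indices to the same helper. A sketch, assuming the PR-internal (non-exported) helper name used in this thread and arbitrary m, n, p:

using Random, SparseArrays

m, n, p = 100, 100, 0.1
I = randsubseq(1:m*n, p)   # sorted random linear indices, as sprand obtains them
V = rand(length(I))
# sparse_sortedlinearindices! is the PR-internal, non-exported helper:
A = SparseArrays.sparse_sortedlinearindices!(I, V, m, n)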

@ViralBShah (Member)

I did mean to update the title to reflect that. Will do shortly.

@ViralBShah ViralBShah changed the title sprand code simplifications sprand and SparseMatrixCSC constructor simplifications Jan 7, 2019
@stevengj (Member) commented Jan 7, 2019

With this new code, can we get rid of the specialized StridedMatrix constructor, or does the latter still have a substantial performance advantage?

@abraunst (Contributor, Author) commented Jan 7, 2019

> With this new code, can we get rid of the specialized StridedMatrix constructor, or does the latter still have a substantial performance advantage?

Unfortunately it is still faster:

julia> using SparseArrays, BenchmarkTools

julia> x=Matrix(sprand(1000,1000,0.1));

master:

julia> @benchmark sparse($x)
BenchmarkTools.Trial: 
  memory estimate:  1.53 MiB
  allocs estimate:  6
  --------------
  minimum time:     1.534 ms (0.00% GC)
  median time:      1.587 ms (0.00% GC)
  mean time:        1.613 ms (1.24% GC)
  maximum time:     27.612 ms (94.07% GC)
  --------------
  samples:          3091
  evals/sample:     1

PR:

julia> @benchmark sparse($x)
BenchmarkTools.Trial: 
  memory estimate:  4.01 MiB
  allocs estimate:  36
  --------------
  minimum time:     2.333 ms (0.00% GC)
  median time:      2.386 ms (0.00% GC)
  mean time:        2.454 ms (2.57% GC)
  maximum time:     31.432 ms (92.45% GC)
  --------------
  samples:          2036
  evals/sample:     1

Noting the extra allocation, I tried the following:

function SparseMatrixCSC{Tv,Ti}(M::AbstractMatrix) where {Tv,Ti}
    @assert !has_offset_axes(M)   # the linear indexing below assumes 1-based axes
    nz = count(!iszero, M)        # first pass: count nonzeros to size the buffers exactly
    I = Vector{Ti}(undef, nz)
    V = Vector{Tv}(undef, nz)
    i, j = 0, 0

    for v in M                    # second pass: record linear index and value of each nonzero
        i += 1
        if !iszero(v)
            I[j += 1] = i
            V[j] = v
        end
    end
    return sparse_sortedlinearindices!(I, V, size(M)...)
end

And we get much closer! Will try to benchmark better tomorrow.

julia> @benchmark sparse($x)
BenchmarkTools.Trial: 
  memory estimate:  1.54 MiB
  allocs estimate:  6
  --------------
  minimum time:     1.615 ms (0.00% GC)
  median time:      1.664 ms (0.00% GC)
  mean time:        1.703 ms (1.72% GC)
  maximum time:     30.738 ms (94.42% GC)
  --------------
  samples:          2930
  evals/sample:     1

@stevengj (Member) commented Jan 8, 2019

This is probably fast enough to get rid of the specialized Matrix method. Performance-critical code will probably never use these methods anyway—you always make large sparse matrices directly, not from dense matrices. These conversions are just here for convenience for small interactive problems, I think.

@abraunst (Contributor, Author) commented Jan 8, 2019

> This is probably fast enough to get rid of the specialized Matrix method. Performance-critical code will probably never use these methods anyway—you always make large sparse matrices directly, not from dense matrices. These conversions are just here for convenience for small interactive problems, I think.

The situation is not clear-cut. Maybe it would be wiser to keep the M::StridedArray constructor and leave the PR as is?

The above approach is faster for very dense matrices but substantially slower for sparser ones in the Transpose test:

master (AbstractArray constructor):

julia> [ begin Random.seed!(1); x=Transpose(Matrix(sprand(n,n,p)));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
   1461.6          486.513        366.928        348.009        347.897    
 121178.0        38327.0        20727.0        18619.0        18414.0      
      2.09966e7      6.31002e6      3.36006e6      2.96888e6      2.94113e6

above proposal:

julia> [ begin Random.seed!(1); x=Transpose(Matrix(sprand(n,n,p)));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
   470.23         393.356        392.965        382.365        381.788    
 38097.0        40642.0        32357.0        31666.0        31449.0      
     6.36208e6      6.15465e6      5.11586e6      5.11357e6      5.00486e6

PR:

julia> [ begin Random.seed!(1); x=Transpose(Matrix(sprand(n,n,p)));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
   1262.9          386.683        273.415        236.594        235.64     
 113408.0        38788.0        19654.0        17685.0        17404.0      
      1.70226e7      5.28324e6      3.07331e6      2.79147e6      2.78202e6

In the Matrix test, by contrast, it is the opposite:

master (M::StridedArray constructor)

julia> [ begin Random.seed!(1); x=Matrix(sprand(n,n,p));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
   188.217        133.582        128.394     138.25     138.061
 10362.0        11663.0         7455.0      6531.4     6473.6  
     1.68884e6      1.5584e6  799494.0    721780.0   705629.0  

above proposal:

julia> [ begin Random.seed!(1); x=Matrix(sprand(n,n,p));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
   246.682        139.053        131.625     132.598     130.24
 15989.0        13112.0         6817.2      5970.83     5806.5 
     2.29853e6      1.6286e6  799516.0    714622.0    696401.0 

PR:

julia> [ begin Random.seed!(1); x=Matrix(sprand(n,n,p));  b=@benchmark sparse($x); minimum(b.times); end for n=(10,100,1000), p=(1.0,1e-1,1e-2,1e-3,1e-4)]
3×5 Array{Float64,2}:
  1112.3          271.748         155.67     120.776     120.347
 95653.0        22313.0          7301.25    5423.17     5190.67 
     9.82539e6      2.33503e6  729518.0   540351.0    520473.0  

@abraunst (Contributor, Author) commented Jan 8, 2019

The failed check is a timeout:

> The job exceeded the maximum time limit for jobs, and has been terminated.

@abraunst (Contributor, Author) commented Jan 8, 2019

BTW how would people feel about exporting sparse(I::Vector{Int}, V::Vector, m::Int, n::Int) = sparse_linearindices!(copy(I), V, m, n) ?

EDIT: added full qualifications and fixed a mistake
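
If such a method were exported, usage would look something like this (hypothetical, since the method is only proposed here):

using SparseArrays

m, n = 3, 3
I = [1, 5, 9]            # sorted column-major linear indices of the nonzeros
V = [1.0, 2.0, 3.0]
S = sparse(I, V, m, n)   # hypothetical method: builds a 3x3 diagonal sparse matrix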

@ViralBShah (Member)

I've restarted the failing build, but it shouldn't really matter since we have enough greens.

@abraunst (Contributor, Author) commented Jan 9, 2019

> The situation is not clear-cut. Maybe it would be wiser to keep the M::StridedArray constructor and leave the PR as is?

Just to recap: with the PR as is, we get "PR" for the first set and "master" for the second set (because we would keep the existing constructor from StridedArray). We could choose essentially any combination of the two, but this one is the only choice that is uniformly better than master (in these tests). As @stevengj said, this is probably not performance-critical code in any case.

Besides the simplification and the performance improvement, the PR fixes the bug reported in #30627 (type instability and a wrong return type for non-0x0 matrices). Let me know how to proceed!

EDIT: rebased.

@KlausC (Contributor) commented Jan 15, 2019

> These conversions are just here for convenience for small interactive problems, I think.

The SparseMatrixCSC(M::AbstractMatrix) constructors are also used as the AbstractMatrix fallback, for example when converting a Symmetric{Tv,SparseMatrixCSC}. There is a PR under review, #30552, which substantially improves performance for those cases.
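
For illustration, a minimal sketch of the dispatch involved (not code from either PR):

using SparseArrays, LinearAlgebra

S = Symmetric(sprand(100, 100, 0.1))
# Symmetric is an AbstractMatrix but not a StridedMatrix, so converting
# it back to sparse goes through the generic AbstractMatrix fallback:
A = sparse(S)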

@abraunst (Contributor, Author)

bump?

@ViralBShah (Member)

@stevengj Are you ok with this PR?
