
Loosen SparseMatrixCSC colptr and rowval integer type restriction #31661

Closed

Conversation

@Pbellive (Contributor) commented Apr 9, 2019

This fixes #31435; please see that issue for a full explanation. @KlausC, any objection to this change?

…restrictive. The integer type needs to be wide enough to store a matrix with the given number of non-zeros, not any sparse matrix of the same size.
@ararslan added the bugfix (This change fixes an existing bug) and sparse (Sparse arrays) labels Apr 9, 2019
@ararslan added this to the 1.2 milestone Apr 9, 2019
@ViralBShah (Member) commented:

I noted a potential objection in #31435.

@KlausC (Contributor) left a comment:


The check should be nnzval < typemax(Ti) for correct handling of the corner case Ti = Int, nnzval = typemax(Ti), where nnzval + 1 overflows.

The error text "Number .." should be lowercase for consistency with the other messages.

@@ -42,7 +42,7 @@ function sparse_check_Ti(m::Integer, n::Integer, Ti::Type)
         n < 0 && throwsz("columns", 'n', n)
         !isbitstype(Ti) || m ≤ typemax(Ti) || throwTi("number of rows", "m", m)
         !isbitstype(Ti) || n ≤ typemax(Ti) || throwTi("number of columns", "n", n)
-        !isbitstype(Ti) || n*m+1 ≤ typemax(Ti) || throwTi("maximal nnz+1", "m*n+1", n*m+1)
+        !isbitstype(Ti) || nnzval+1 ≤ typemax(Ti) || throwTi("Number of nonzeros", "nnz+1", nnzval+1)

Suggested change:
-        !isbitstype(Ti) || nnzval+1 ≤ typemax(Ti) || throwTi("Number of nonzeros", "nnz+1", nnzval+1)
+        !isbitstype(Ti) || nnzval < typemax(Ti) || throwTi("number of nonzeros", "nnz+1", nnzval+1)
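
For illustration, a minimal sketch of the corner case behind this suggestion (not code from this PR, assuming Ti = Int on a 64-bit machine): Julia integer arithmetic wraps on overflow, so the +1 form of the check silently passes exactly when the count no longer fits.

    Ti = Int
    nnzval = typemax(Ti)        # corner case: nnz fills the index type exactly

    nnzval + 1 <= typemax(Ti)   # true  -- nnzval + 1 wraps to typemin(Int), so the check passes spuriously
    nnzval < typemax(Ti)        # false -- the suggested form correctly triggers throwTi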

@KlausC (Contributor) commented Apr 10, 2019

As nnz(A) may change during the lifetime of A, it would be wise to assert that the value we want to store in A.colptr[A.n+1] (that is, the new nnz(A)+1) fits in a Ti.

I agree that the proposed check of length(A.nzval) also covers #31024, which was fixed by #31118. But why not also check length(A.rowval)? The idea behind #31118 was to check only immutable values in the constructor.

I would be happy if we could ensure full consistency of the data during the whole lifetime of A. That includes:

  1. 0 < A.colptr[k] <= A.colptr[k+1] for k = 1:A.n (that would cover Bound violation in SparseArrays - segfault #31024 also)
  2. min(length(A.rowval), length(A.nzval)) >= nnz(A)
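
For illustration only, a minimal Julia sketch of the two invariants listed above, written against the SparseMatrixCSC fields of the time (A.colptr, A.rowval, A.nzval, A.n); the function name check_csc_consistency is hypothetical and not part of this PR.

    using SparseArrays

    function check_csc_consistency(A::SparseMatrixCSC)
        colptr, n = A.colptr, A.n
        # Invariant 1: column pointers are positive and non-decreasing.
        for k in 1:n
            0 < colptr[k] <= colptr[k+1] ||
                throw(ArgumentError("invalid colptr at column $k"))
        end
        # Invariant 2: the index/value buffers can hold all stored entries.
        min(length(A.rowval), length(A.nzval)) >= nnz(A) ||
            throw(ArgumentError("rowval/nzval shorter than nnz(A)"))
        return true
    end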

@@ -23,7 +23,7 @@ struct SparseMatrixCSC{Tv,Ti<:Integer} <: AbstractSparseMatrix{Tv,Ti}
     function SparseMatrixCSC{Tv,Ti}(m::Integer, n::Integer, colptr::Vector{Ti}, rowval::Vector{Ti},
                                     nzval::Vector{Tv}) where {Tv,Ti<:Integer}

-        sparse_check_Ti(m, n, Ti)
+        sparse_check_Ti(m, n, length(nzval), Ti)

Suggested change:
-        sparse_check_Ti(m, n, length(nzval), Ti)
+        sparse_check_Ti(m, n, max(length(nzval), length(rowval)), Ti)
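
As a hedged aside, the point of the loosened check (see #31435) is to permit index types that can hold the number of stored entries even when they cannot hold m*n. A sketch of what the relaxed constructor check would allow, assuming the change proposed in this PR (the example values are illustrative):

    using SparseArrays

    # m*n = 10^10 exceeds typemax(Int32), but an empty matrix has zero stored
    # entries, so Int32 indices are perfectly adequate for it.
    m, n = 10^5, 10^5
    colptr = ones(Int32, n + 1)     # all columns empty
    rowval = Int32[]
    nzval  = Float64[]
    A = SparseMatrixCSC{Float64,Int32}(m, n, colptr, rowval, nzval)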

@@ -594,7 +594,7 @@ function sparse!(I::AbstractVector{Ti}, J::AbstractVector{Ti},
                  csccolptr::Vector{Ti}, cscrowval::Vector{Ti}, cscnzval::Vector{Tv}) where {Tv,Ti<:Integer}

     require_one_based_indexing(I, J, V)
-    sparse_check_Ti(m, n, Ti)
+    sparse_check_Ti(m, n, length(V), Ti)

Suggested change:
-    sparse_check_Ti(m, n, length(V), Ti)
+    sparse_check_Ti(m, n, length(I), Ti)

@ViralBShah (Member) commented:

Can't setindex! potentially grow the size of nzval and rowval, thus requiring the check to be the way it is right now (but being careful about the overflow)?
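
A small illustration of this point (not from the PR): assigning to a structurally zero position inserts a stored entry and grows rowval and nzval, so nnz can exceed whatever was validated at construction time.

    using SparseArrays

    A = spzeros(Float64, Int32, 3, 3)   # no stored entries yet
    length(A.nzval)                      # 0
    A[2, 2] = 1.0                        # inserting a new stored entry grows rowval and nzval
    length(A.nzval)                      # 1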

@mbauman (Member) commented Apr 10, 2019

Yes, but:

So these sorts of "it'd be nice to guarantee"s are non sequiturs. We just need to get things back to the old status quo, and then we can discuss how to approach the next steps.

It's either merge this patch or wholesale revert #31118.

@KristofferC (Member) commented:

The discussion here shows that there are intricacies we don't really have time to deal with now, considering we want to branch for 1.3 about now. I suggest just reverting the breaking PR (#31667) and then the discussion can take as long as it wants to add it back in a non-breaking way.

@KlausC (Contributor) commented Apr 10, 2019

Sorry, but without a replacement for #31118 the software is breaking because of #31024. I propose merging this PR with the proposed changes.

@KristofferC (Member) commented:

the software is breaking because of

What code would reverting the PR break? I see an error being replaced with another one.

@mbauman (Member) commented Apr 10, 2019

Note that we're using breaking with a very specific meaning here: it's not that Julia is "broken" (e.g., with a segfault in some instances), but rather that a particular change caused a regression that broke someone's workflow that previously worked. That's significantly worse than an existing bug that has been a bug for a long time.

@Pbellive (Contributor, Author) commented:

I'm in favour of @KristofferC's suggestion to revert #31118 and continue the discussion of a long-term fix here. @ViralBShah's and @KlausC's comments that setindex! can grow the internal arrays are of course important points. To address that, it seems like it would be sufficient to add a check inside the setindex! methods to make sure that A.colptr[n+1] < typemax(Ti) remains true after new entries are added to the matrix. I haven't tested that, but I wouldn't think it would be much of a performance hit.
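
A hedged sketch of the kind of guard described above (the helper name and its placement are hypothetical, not code from this PR): before inserting a new stored entry, verify that the updated colptr[n+1], i.e. the new nnz+1, still fits in Ti.

    using SparseArrays

    # Hypothetical helper; a real implementation would be called from the
    # insertion path of setindex! before growing rowval/nzval.
    function _check_insert_fits(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
        # A.colptr[A.n+1] == nnz(A) + 1; after one insertion it becomes nnz(A) + 2,
        # so require strict headroom below typemax(Ti).
        if isbitstype(Ti) && A.colptr[A.n+1] >= typemax(Ti)
            throw(ArgumentError("number of nonzeros would exceed typemax($Ti)"))
        end
        return nothing
    end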

@mauro3 (Contributor) commented Apr 10, 2019

Sorry, but without a replacement for #31118 the software is breaking because of #31024

@KlausC: until it is fixed you can monkey-patch it: https://discourse.julialang.org/t/difficulties-with-recent-julia-releases/22969/2

@ararslan removed this from the 1.2 milestone Apr 10, 2019
@KlausC (Contributor) commented Apr 11, 2019

@mbauman

Note that we're using breaking with a very specific meaning here: ... a particular change caused a regression that broke someone's workflow that previously worked

Does that mean we don't introduce data consistency checks because someone's program worked, even though the data it operated on was partially corrupted?

In our case, I agree that the introduced check was too restrictive, but in general...?

@mbauman (Member) commented Apr 11, 2019

It just means that we need to be careful about doing so. We flag all such changes as minor changes, we ensure they appear in NEWS.md, and such changes have a higher bar to cross to get merged.

In cases where the previous behavior resulted in corrupted or buggy answers, then that's not breaking — it's a bug. Admittedly sometimes the line between the two can be blurry.

In this case, I still really want to see a resurrected #31118+#31661 that fixes the segfault and protects against overfilling such arrays. Had we not been right on a branching threshold, we probably would have let things percolate a little longer without a direct revert.

@ViralBShah (Member) commented Apr 11, 2019

We always want correctness, but this was a tricky situation as already discussed above. Reverting and redoing is certainly the right thing.

@Pbellive (Contributor, Author) commented:

I'm happy to update and expand this PR to make an attempt at addressing the concerns raised in this thread. I should be able to get to it in the next 4-5 days.

@KlausC (Contributor) commented Apr 14, 2019

@Pbellive, sorry for taking over the job. I hope you can live with this solution.

@ViralBShah closed this Apr 14, 2019
Labels: bugfix (This change fixes an existing bug), sparse (Sparse arrays)

Successfully merging this pull request may close these issues:
SparseMatrixCSC type check is too restrictive (regression)

8 participants