make cumsum result type consistent with sum #9665

stevengj · 2015-01-07T18:26:30Z

This means that cumsum(Int8[1,2,3]) is Int[1,3,6] rather than an Int8 array, for example. If you really want to do the sum as Int8, you can still do cumsum!(Array(Int8,3), Int8[1,2,3]).

In principle, this is a breaking change, but I'm not sure if any code in the wild actually relied on the lack of promotion in the cumsum results? (The most common usage seems to be for Float64 and Float32 arrays, which are unaffected.)

Also, this makes cumsum! and cumsum use the same pairwise code for 1d arrays (whereas previously cumsum! used a completely different codepath based on the one for multidimensional arrays, which just does naive summation). Ideally, we'd use the pairwise algorithm for multidimensional arrays too, but that requires a much bigger code rewrite.

(See discussion in #9650 with @timholy.)

…use same codepath for 1d arrays (see discussion in JuliaLang#9650)

stevengj · 2015-01-07T20:09:43Z

Hmm, getting:

    From worker 8:       * linalg/givens
exception on 6: ERROR: `convert` has no method matching convert(::Type{Base.SparseMatrix.SparseMatrixCSC{Tv,Ti<:Integer}}, ::Int64, ::Int64, ::Array{Int64,1}, ::Array{Int32,1}, ::Array{Float64,1})
This may have arisen from a call to the constructor Base.SparseMatrix.SparseMatrixCSC{Tv,Ti<:Integer}(...), since type constructors fall back to convert methods in julia v0.4.
 in getindex_general at sparse/sparsematrix.jl:1114
 in getindex at sparse/sparsematrix.jl:1122
 in anonymous at no file:18
 in runtests at /Users/stevenj/Code/julia/test/testdefs.jl:5
 in anonymous at multi.jl:852
 in run_work_thunk at multi.jl:603
 in anonymous at task.jl:852
while loading linalg/umfpack.jl, in expression starting on line 13

Not sure how this patch is causing that failure, since cumsum is not used in base/linalg/*.

andreasnoack · 2015-01-07T21:17:17Z

Could it be https://github.com/JuliaLang/julia/blob/master/base/sparse/sparsematrix.jl#L1078 that now widens the integer type?

ivarne · 2015-01-07T21:59:14Z

+1 on the pairwise behavior.

Neutral on the promotion. The promotion argument is strong for sum and other reductions that transform an array to a scalar, because you get better accuracy at a very low cost. For cumsum you actually pay the cost of using a bigger result array, but if that is a concern you probably can't afford the implicit copy and should use cumsum! anyway.

stevengj · 2015-01-07T22:26:10Z

Yikes, yes, cumsum is used all over the sparse code, and the promotion will be a problem for Int32 index types.

ViralBShah · 2015-01-08T13:23:35Z

It shouldn't be difficult to adapt the sparse code to this change, if we want it.

timholy · 2015-01-08T14:28:24Z

+1 for this

tkelman · 2015-01-08T14:42:14Z

Seems odd to me to have cumsum and cumsum! giving you different types. I thought we've generally been moving away from integer promotion, not towards more of it?

timholy · 2015-01-08T14:50:40Z

It also seems odd to have cumsum(x)[end] != sum(x), which is what you'll get from cumsum(rand(UInt8, 1000)).

While we removed promotion from ::UInt8 + ::UInt8 (because there was no way for a user to turn it off, whereas a user can turn it on with UInt(x)+y), we added integer promotion in sum.

stevengj · 2015-01-08T15:17:29Z

@tkelman, cumsum! gives you whatever type you want, because you specify the output type via its first argument. If you want it to give the same type as cumsum, you can.

tkelman · 2015-01-08T15:39:01Z

I have no problem with cumsum and sum behaving differently with respect to type promotion - the whole point of cumsum is that you need to use the running sum array for some following calculation. Changing the element type in this operation feels a lot like changing the element type in adding two arrays. Overflow is a valid concern, but isn't the widening workaround of cumsum! just as usable there?

nalimilan · 2015-01-08T15:42:18Z

OTOH if sum(x::Float32) overflows, then cumsum(x::Float32) overflows too for the last elements. So widening the type in the former case and not the latter does not make much sense, even if the cost associated to the widening is of course larger for cumsum.

tkelman · 2015-01-08T15:43:08Z

I'm also concerned about this breaking all of the sparse code. We can grep for cumsum and replace everything, but that's causing churn in code that's already known to be poorly tested, fragile, and in need of a major rewrite.

ViralBShah · 2015-01-08T16:42:43Z

There are only 6 mentions of cumsum in the sparse code. I don't think that is hard to fix.

Off-topic, but it would be nice to know what you'd want to do in a rewrite of the sparse code. Perhaps file an issue with your thoughts? I plan to hack on it over the next few days.

tkelman · 2015-01-08T16:48:28Z

And elsewhere in base? In packages that deal with sparse matrices?

it would be nice to know what you'd want to do in a rewrite of the sparse code. Perhaps file an issue with your thoughts?

I already more or less did, #8001. I'd like sparse matrices to be first class in terms of the array operations and reductions that work efficiently on them. A few days of hacking can improve things gradually, but I'm envisioning a more comprehensive rewrite that will need changes to the way the rest of the standard library interacts with array objects to make sparse or dense both work well. A more formal concept of interfaces, and a few generic iterator protocols will likely be required here.

timholy · 2015-01-08T17:00:47Z

Just to make sure the bikeshedding stays in perspective:

This patch contains an absolutely essential change, implementing cumsum as a call to cumsum! and allowing users to control the output type themselves. This should darn well be uncontroversial and should go in ASAP.
The only concern is over a line or two of code, whether the default should be a promoted type or the same type. Since this is pretty trivial, I'm going to refrain from arguing this into the ground. My final pitch to @tkelman is that when you add two arrays elementwise, each element of the sum is just a+b, i.e., you're only summing two numbers. When you call cumsum, you're forming a sum of length(A). That's basically guaranteed to overflow for small integer types. But like I said, I don't care much one way or another. What I care about is that this patch allows me to assume control of the answer.

tkelman · 2015-01-08T17:04:00Z

This patch contains an absolutely essential change, implementing cumsum as a call to cumsum! and allowing users to control the output type themselves. This should darn well be uncontroversial and should go in ASAP.

Yes, that part absolutely makes sense.

When you call cumsum, you're forming a sum of length(A)

You're also returning an array. I really strongly dislike element type promotion in functions that return arrays.

stevengj · 2015-01-08T17:04:12Z

@timholy, you don't need this patch to control the answer, because cumsum! already exists in Julia. The only difference here is that cumsum! for 1d arrays calls the pairwise code rather than doing naive summation. (But the pairwise stuff is irrelevant for integer types.)

But that being said, I'm fine with dropping the promotion, and just pushing the patch for the change in cumsum! to the pairwise implementation, since it seems like the case for promotion is ambiguous.

timholy · 2015-01-08T17:09:43Z

Sounds fine, let's do it that way. We may want to add something to the docs for cumsum, something like:

Cumulative sum along a dimension. The dimension defaults to 1. See also 'cumsum!', especially if the computation might overflow.

ivarne · 2015-01-08T17:17:32Z

There is nothing in cumsum! to prevent overflow, so that suggestion seems slightly misleading. You can overflow any fixed with integer type, so no automatic promotion will fix that for us. How about

See also `cumsum!` which lets you specify a different array-
type to store the result in. This might be useful if the sum
might otherwise overflow, as there is no overflow detection in
this function.

timholy · 2015-01-08T17:19:07Z

That's fine, and closer to what I originally wrote. I was just trying to be brief.

ivarne · 2015-01-08T17:22:53Z

Why brief, when we get so much complains that people don't understand our documentation and people beg for examples?

ivarne · 2015-01-08T17:24:16Z

Seems like @stevengj just nailed it and wrote something both brief and correct.

timholy · 2015-01-08T17:24:31Z

Entirely reasonable.

make cumsum result type consistent with sum, make cumsum! and cumsum …

b19d8f8

…use same codepath for 1d arrays (see discussion in JuliaLang#9650)

stevengj added the kind:breaking This change will break code label Jan 7, 2015

stevengj referenced this pull request Jan 7, 2015

update NEWS and FAQ entry on nothingness

88a4725

NEWS for 9665

c829944

stevengj closed this Jan 8, 2015

stevengj added a commit to stevengj/julia that referenced this pull request Jan 8, 2015

make cumsum call cumsum! (see JuliaLang#9665)

4e18082

stevengj mentioned this pull request Jan 8, 2015

make cumsum call cumsum! #9683

Merged

tkelman mentioned this pull request Sep 23, 2015

problem with cumsum #13244

Closed

JeffBezanson mentioned this pull request Apr 19, 2018

Change accumulation promotion behaviour #26658

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make cumsum result type consistent with sum #9665

make cumsum result type consistent with sum #9665

stevengj commented Jan 7, 2015

stevengj commented Jan 7, 2015

andreasnoack commented Jan 7, 2015

ivarne commented Jan 7, 2015

stevengj commented Jan 7, 2015

ViralBShah commented Jan 8, 2015

timholy commented Jan 8, 2015

tkelman commented Jan 8, 2015

timholy commented Jan 8, 2015

stevengj commented Jan 8, 2015

tkelman commented Jan 8, 2015

nalimilan commented Jan 8, 2015

tkelman commented Jan 8, 2015

ViralBShah commented Jan 8, 2015

tkelman commented Jan 8, 2015

timholy commented Jan 8, 2015

tkelman commented Jan 8, 2015

stevengj commented Jan 8, 2015

timholy commented Jan 8, 2015

ivarne commented Jan 8, 2015

timholy commented Jan 8, 2015

ivarne commented Jan 8, 2015

ivarne commented Jan 8, 2015

timholy commented Jan 8, 2015

make cumsum result type consistent with sum #9665

make cumsum result type consistent with sum #9665

Conversation

stevengj commented Jan 7, 2015

stevengj commented Jan 7, 2015

andreasnoack commented Jan 7, 2015

ivarne commented Jan 7, 2015

stevengj commented Jan 7, 2015

ViralBShah commented Jan 8, 2015

timholy commented Jan 8, 2015

tkelman commented Jan 8, 2015

timholy commented Jan 8, 2015

stevengj commented Jan 8, 2015

tkelman commented Jan 8, 2015

nalimilan commented Jan 8, 2015

tkelman commented Jan 8, 2015

ViralBShah commented Jan 8, 2015

tkelman commented Jan 8, 2015

timholy commented Jan 8, 2015

tkelman commented Jan 8, 2015

stevengj commented Jan 8, 2015

timholy commented Jan 8, 2015

ivarne commented Jan 8, 2015

timholy commented Jan 8, 2015

ivarne commented Jan 8, 2015

ivarne commented Jan 8, 2015

timholy commented Jan 8, 2015