Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more accurate cumsum #9648

Closed
stevengj opened this issue Jan 6, 2015 · 2 comments · Fixed by #9650
Closed

more accurate cumsum #9648

stevengj opened this issue Jan 6, 2015 · 2 comments · Fixed by #9650

Comments

@stevengj
Copy link
Member

stevengj commented Jan 6, 2015

This is my fault in #4039, I think: although cumsum implements a "pairwise" algorithm, unlike the sum function it doesn't actually get any accuracy benefit from this, and is in fact equivalent to naive summation:

x = rand(10^7)
foldl(+, 0.0, x) == cumsum(x)[end]

produces true. Is there an easy way to rearrange cumsum_pairwise so that it is as accurate as sum?

cc: @lindahua, @timholy

@stevengj
Copy link
Member Author

stevengj commented Jan 6, 2015

This implementation appears to achieve accuracy equal to sum, but at the price of roughly an extra pass over the array:

function cumsum_pw(v::AbstractVector, c::AbstractVector, s, i1, n)
    if n < 128
        @inbounds c[i1] = +(s, v[i1])
        for i = i1+1:i1+n-1
            @inbounds c[i] = +(c[i-1], v[i])
        end
    else
        n2 = div(n,2)
        csp(v, c, s, i1, n2)
        csp(v, c, zero(s), i1+n2, n-n2)
        @inbounds s_ = c[(i1+n2)-1]
        for i = i1+n2:i1+n-1
            @inbounds c[i] += s_
        end
    end
    c
end

In a quick benchmark it takes about twice as long as cumsum.

@stevengj stevengj changed the title no accuracy benefit from pairwise cumsum more accurate cumsum Jan 6, 2015
@stevengj
Copy link
Member Author

stevengj commented Jan 6, 2015

Okay, this variant achieves the same accuracy as sum without the extra pass, and is actually faster than our current cumsum by a significant margin (more than a factor of two), probably because of the use of the temporary variable s_ in the base case just because it doesn't include the allocation part of cumsum ... compared to Base.cumsum_pairwise, the speed is indistinguishable:

function csp!(v::AbstractVector, c::AbstractVector, s, i1, n)
    if n < 128
        @inbounds s_ = v[i1]
        @inbounds c[i1] = +(s, s_)
        for i = i1+1:i1+n-1
            @inbounds s_ = +(s_, v[i])
            @inbounds c[i] = +(s, s_)
        end
        return s_
    else
        n2 = div(n,2)
        s_ = csp!(v, c, s, i1, n2)
        s_ += csp!(v, c, s + s_, i1+n2, n-n2)
        return s_
    end
end

stevengj added a commit to stevengj/julia that referenced this issue Jan 6, 2015
stevengj added a commit that referenced this issue Jan 7, 2015
get pairwise-sum accuracy for cumsum (fix #9648)
stevengj added a commit that referenced this issue Jan 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant