Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate dependence on Regex from cartesian.jl #10310

Merged
merged 1 commit into from
Feb 24, 2015
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Feb 24, 2015

CC @JeffBezanson

In some ways it would be nice to move this earlier in the boot sequence, but until string.jl gets loaded there seems to be little point. So perhaps it's best left where it is. CC @andreasnoack.

@kmsquire
Copy link
Member

Just curious, what's the purpose of removing the dependent on regex here?

@JeffBezanson
Copy link
Member

It's nice for basic numerical functionality not to depend on something like regexes. I think it would be even better for this stuff not to depend on symbol naming hacks. I promise to make tuples faster. Could we also use getfield on an immutable with an integer index, combined with ordinary gensyms?

@timholy
Copy link
Member Author

timholy commented Feb 24, 2015

@kmsquire, Jeff requested it in #8501 (comment).

@JeffBezanson, I'm really looking forward to faster tuples.

Could we also use getfield on an immutable with an integer index, combined with ordinary gensyms?

I believe you're asking about futuristic stuff that you already have on your machine 😄. Check out the new CartesianIndex & CartesianRange types. For example, on 0.4 we can now write reductions in a fashion that I think even you will agree is simply lovely (with the exception of needing @inbounds @simd to get highest efficiency):

A = rand(10^4, 10^4)
R1 = zeros(1, 10^4)   # for reducing along dimension 1
R2 = zeros(10^4, 1)   # for reducing along dimension 2
R12 = zeros(1, 1)       # for reducing along dimensions 1 and 2

function mysum!(R, A) 
    szR = CartesianIndex(size(R))
    @inbounds @simd for I in eachindex(A)
        J = min(I, szR)
        R[J] += A[I]
    end
    R
end

By comparison, here's the version based on Base.Cartesian:

stagedfunction stsum!{T,N}(R, A::AbstractArray{T,N})
    quote
        @nexprs $N d->(sizeR_d = size(R,d))
        @nloops $N i A d->(j_d = sizeR_d==1 ? 1 : i_d) begin
            @inbounds (@nref $N R j) += (@nref $N A i)
        end
        R
    end
end

For R1 and R12 the two are basically the same, and for R2 the version that doesn't use macros is actually faster. So between @Jutho, @ArchRobison (for the SIMD-ification), and I, we've eliminated a lot of what one formerly needed Base.Cartesian for.

But it's not obvious that everything can yet be replaced. The Cartesian macros are really flexible.

@JeffBezanson
Copy link
Member

Yes, that is a HUGE improvement 👏.

Maybe Cartesian could use j[d] instead of j_d? I always found this notation a bit jarring compared to the rest of the language.

@timholy
Copy link
Member Author

timholy commented Feb 24, 2015

...for instance, this implementation of stsum! blows mysum! out of the water:

stagedfunction stsum!{T,N}(R, A::AbstractArray{T,N})
    quote
        @nexprs $N d->(sizeR_d = size(R,d))
        if size(R, 1) == 1
            # We're reducing along dimension 1, in which case we can accumulate to a scalar
            sizeA1 = size(A, 1)
            @nloops $N i d->(d>1? (1:size(A,d)) : (1:1)) d->(j_d = sizeR_d==1 ? 1 : i_d) begin
                @inbounds r = (@nref $N R j)
                @simd for i_1 = 1:sizeA1
                    @inbounds r += (@nref $N A i)
                end
                @inbounds (@nref $N R j) = r
            end
        else
            # We're not reducing along dimension 1, so must accumulate to an array
            @nloops $N i A d->(j_d = sizeR_d==1 ? 1 : i_d) begin
                @inbounds (@nref $N R j) += (@nref $N A i)
            end
        end
        R
    end
end

@timholy
Copy link
Member Author

timholy commented Feb 24, 2015

It's definitely jarring notation. It's been so long since I designed Cartesian, I forget why I decided against it. I think I had what I thought was a good reason, but I am not at all sure.

It would be a huge project to change it now, though. If someone wants to tackle it, great, but I'm personally more interested in just eliminating Cartesian altogether, if we can get the optimizations we need without it.

timholy added a commit that referenced this pull request Feb 24, 2015
Eliminate dependence on Regex from cartesian.jl
@timholy timholy merged commit f15ecff into master Feb 24, 2015
@timholy timholy deleted the teh/cartesian_noregex branch February 24, 2015 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants