Document repeat function
Extend ind2sub and sub2ind to accept a reusable array of indices
Expose rem1 and fld1, defined by analogy with mod1
johnmyleswhite committed Jul 10, 2013
1 parent b34e95b commit a559c56
Showing 5 changed files with 55 additions and 32 deletions.
77 changes: 45 additions & 32 deletions base/abstractarray.jl
@@ -1045,6 +1045,17 @@ function sub2ind(dims, I::Integer...)
return index
end

function sub2ind{T<:Integer}(dims::Array{T}, sub::Array{T})
    ndims = length(dims)
    ind = sub[1]
    stride = 1
    for k in 2:ndims
        stride = stride * dims[k - 1]
        ind += (sub[k] - 1) * stride
    end
    return ind
end

sub2ind{T<:Integer}(dims, I::AbstractVector{T}...) =
[ sub2ind(dims, map(X->X[i], I)...)::Int for i=1:length(I[1]) ]

@@ -1057,23 +1068,6 @@ ind2sub(dims::(Integer,Integer,Integer), ind::Int) =
(rem(ind-1,dims[1])+1, div(rem(ind-1,dims[1]*dims[2]), dims[1])+1,
div(rem(ind-1,dims[1]*dims[2]*dims[3]), dims[1]*dims[2])+1)

function ind2sub(dims::(Integer,Integer...), ind::Int)
    ndims = length(dims)
    stride = dims[1]
    for i=2:ndims-1
        stride *= dims[i]
    end

    sub = ()
    for i=(ndims-1):-1:1
        rest = rem(ind-1, stride) + 1
        sub = tuple(div(ind - rest, stride) + 1, sub...)
        ind = rest
        stride = div(stride, dims[i])
    end
    return tuple(ind, sub...)
end

function ind2sub{T<:Integer}(dims::(Integer,Integer...), ind::AbstractVector{T})
n = length(dims)
l = length(ind)
@@ -1087,55 +1081,74 @@ function ind2sub{T<:Integer}(dims::(Integer,Integer...), ind::AbstractVector{T})
return t
end

function ind2sub!{T<:Integer}(sub::Array{T}, dims::Array{T}, ind::T)

@timholy
Member
Jul 24, 2013

I would leave the type of dims undeclared, so people can use tuples too.

Your single biggest bottleneck is the call to this function, although no one line dominates. For potentially higher performance, I would experiment with adding specialized methods:

ind2sub!{T<:Integer}(sub::Array{T}, dims::Integer, ind::T)
ind2sub!{T<:Integer}(sub::Array{T}, dims::(Integer,), ind::T)
ind2sub!{T<:Integer}(sub::Array{T}, dims::(Integer,Integer), ind::T)
ind2sub!{T<:Integer}(sub::Array{T}, dims::(Integer,Integer,Integer), ind::T)

that don't use a for-loop. I would then create size_out with ntuple (keeping the result as a tuple), which would allow these dimensionally-specific variants to be called when appropriate. That might also have another benefit: your line R = Array(T, size_out...) accounts for almost 10% of the time, and believe it or not it's almost all in the size_out... part (array/tuple conversions are frustratingly expensive).

The upper bound on savings from these steps is a 50% speed increase, so nothing too dramatic, but it might cut substantially into Matlab's minor speed advantage.
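
The suggestion above can be sketched roughly as follows. This is only an illustration, not code from the commit: the name `ind2sub_sketch!` is hypothetical, the sketch uses current Julia syntax rather than the 2013 syntax in the diff, and it assumes positive linear indices. It shows loop-free methods for 1, 2, and 3 dimensions selected by dispatch on the tuple length, a generic fallback, and an output size built with `ntuple` so that it stays a tuple.

```julia
# Hypothetical helper; the name and signatures are illustrative only.
# Writes the 1-based, column-major subscripts of linear index `ind` into `sub`.

# One dimension: the subscript is the linear index itself.
function ind2sub_sketch!(sub::AbstractVector{<:Integer}, dims::Tuple{Integer}, ind::Integer)
    sub[1] = ind
    return sub
end

# Two dimensions: no loop, just one mod1/fld1 pair.
function ind2sub_sketch!(sub::AbstractVector{<:Integer}, dims::NTuple{2,Integer}, ind::Integer)
    sub[1] = mod1(ind, dims[1])
    sub[2] = fld1(ind, dims[1])
    return sub
end

# Three dimensions: peel off one dimension at a time.
function ind2sub_sketch!(sub::AbstractVector{<:Integer}, dims::NTuple{3,Integer}, ind::Integer)
    sub[1] = mod1(ind, dims[1])
    rest = fld1(ind, dims[1])
    sub[2] = mod1(rest, dims[2])
    sub[3] = fld1(rest, dims[2])
    return sub
end

# Generic fallback for any other number of dimensions.
function ind2sub_sketch!(sub::AbstractVector{<:Integer}, dims::Tuple{Vararg{Integer}}, ind::Integer)
    rest = ind
    for k in 1:length(dims)
        sub[k] = mod1(rest, dims[k])
        rest = fld1(rest, dims[k])
    end
    return sub
end

# Keeping the output size as a tuple (via ntuple) lets dispatch pick a
# dimension-specific method and avoids array-to-tuple conversion.
A = reshape(1:6, 2, 3)
inner = (1, 2)
outer = (2, 1)
size_out = ntuple(i -> size(A, i) * inner[i] * outer[i], ndims(A))

buf = zeros(Int, length(size_out))   # reused across calls
ind2sub_sketch!(buf, size_out, 7)
```

Whether the specialized methods are actually faster would need to be measured; the sketch only shows the shape of the idea.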

    ndims = length(dims)
    stride = dims[1]
    for i in 2:(ndims - 1)
        stride *= dims[i]
    end
    for i in (ndims - 1):-1:1
        rest = rem1(ind, stride)
        sub[i + 1] = div(ind - rest, stride) + 1
        ind = rest
        stride = div(stride, dims[i])
    end
    sub[1] = ind
    return
end

indices(I) = I
indices(I::Int) = I
indices(I::Real) = convert(Int, I)
indices(I::AbstractArray{Bool,1}) = find(I)
indices(I::Tuple) = map(indices, I)

# Generalized repmat
fld1{T <: Integer}(x::T, y::Integer) = fld(x - one(T), y) + one(T)

function repeat{T}(A::Array{T};
inner::Array{Int} = ones(Int, ndims(A)),
outer::Array{Int} = ones(Int, ndims(A)))
size_A = size(A)
ndims_A = ndims(A)
length_inner, length_outer = length(inner), length(outer)
ndims_out = max(ndims_A, length_inner, length_outer)
ndims_in = ndims(A)
length_inner = length(inner)
length_outer = length(outer)
ndims_out = max(ndims_in, length_inner, length_outer)

if length_inner < ndims_A || length_outer < ndims_A
error("Inner/outer repetitions must be set for all input dimensions")
if length_inner < ndims_in || length_outer < ndims_in
msg = "Inner/outer repetitions must be set for all input dimensions"
throw(ArgumentError(msg))
end

size_in = Array(Int, ndims_in)
size_out = Array(Int, ndims_out)
inner_size_out = Array(Int, ndims_out)

for i in 1:ndims_in
size_in[i] = size(A, i)
end
for i in 1:ndims_out
t1 = ndims_A < i ? 1 : size_A[i]
t1 = ndims_in < i ? 1 : size_in[i]
t2 = length_inner < i ? 1 : inner[i]
t3 = length_outer < i ? 1 : outer[i]
size_out[i] = t1 * t2 * t3
inner_size_out[i] = t1 * t2
end

length_out = prod(size_out)
indices_in = Array(Int, ndims_in)
indices_out = Array(Int, ndims_out)

length_out = prod(size_out)
R = Array(T, size_out...)

for index_out in 1:length_out
indices_out = ind2sub(tuple(size_out...), index_out)
indices_in = Array(Int, length(indices_out))
for t in 1:length(indices_out)
indices_in[t] = indices_out[t]
ind2sub!(indices_out, size_out, index_out)
for t in 1:ndims_in
# "Project" outer repetitions into inner repetitions
indices_in[t] = mod1(indices_out[t], inner_size_out[t])
# Find inner repetitions using flooring division
if inner[t] != 1
indices_in[t] = fld1(indices_in[t], inner[t])
end
end
index_in = sub2ind(size_A, tuple(indices_in[1:ndims_A]...)...)
index_in = sub2ind(size_in, indices_in)
R[index_out] = A[index_in]
end

2 changes: 2 additions & 0 deletions base/exports.jl
@@ -327,6 +327,7 @@ export
factor,
factorial,
fld,
fld1,
flipsign,
float,
#float16,
@@ -404,6 +405,7 @@ export
reim,
reinterpret,
rem,
rem1,
round,
sec,
secd,
2 changes: 2 additions & 0 deletions base/operators.jl
@@ -87,6 +87,8 @@ const % = rem

# mod returns in [0,y) whereas mod1 returns in (0,y]
mod1{T<:Real}(x::T, y::T) = y-mod(y-x,y)
rem1{T<:Real}(x::T, y::T) = rem(x-1,y)+1
fld1{T<:Real}(x::T, y::T) = fld(x-1,y)+1

# cmp returns -1, 0, +1 indicating ordering
cmp{T<:Real}(x::T, y::T) = int(sign(x-y))
2 changes: 2 additions & 0 deletions base/promotion.jl
@@ -165,6 +165,8 @@ rem(x::Real, y::Real) = rem(promote(x,y)...)
mod(x::Real, y::Real) = mod(promote(x,y)...)

mod1(x::Real, y::Real) = mod1(promote(x,y)...)
rem1(x::Real, y::Real) = rem1(promote(x,y)...)
fld1(x::Real, y::Real) = fld1(promote(x,y)...)
cmp(x::Real, y::Real) = cmp(promote(x,y)...)

max(x::Real, y::Real) = max(promote(x,y)...)
4 changes: 4 additions & 0 deletions doc/stdlib/linalg.rst
@@ -278,6 +278,10 @@ Linear algebra functions in Julia are largely implemented by calling functions f

Construct a matrix by repeating the given matrix ``n`` times in dimension 1 and ``m`` times in dimension 2.

.. function:: repeat(A, inner = Int[], outer = Int[])

Construct an array by repeating the entries of ``A``. The i-th element of ``inner`` specifies the number of times that the individual entries of the i-th dimension of ``A`` should be repeated. The i-th element of ``outer`` specifies the number of times that a slice along the i-th dimension of ``A`` should be repeated.

.. function:: kron(A, B)
Kronecker tensor product of two vectors or two matrices.
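
As a rough illustration of the `inner` and `outer` semantics documented above, the sketch below calls `repeat` as it exists in current Julia (keyword arguments, tuples accepted), which may differ in detail from the 2013 signature in this commit. The helper `repeat1d_sketch` is hypothetical: it reduces the `mod1`/`fld1` index mapping used in the diff to one dimension.

```julia
# Each entry is repeated twice along dimension 2 (inner), then the whole
# result is repeated twice along dimension 1 (outer).
A = [1 2; 3 4]
R = repeat(A, inner = (1, 2), outer = (2, 1))

# Hypothetical 1-D version of the index mapping in the diff.
function repeat1d_sketch(v::AbstractVector, inner::Integer, outer::Integer)
    inner_len = length(v) * inner           # length after inner repetition only
    out = similar(v, inner_len * outer)
    for i_out in 1:length(out)
        i_inner = mod1(i_out, inner_len)    # project outer repetitions into the inner block
        i_in = fld1(i_inner, inner)         # undo the inner repetition with flooring division
        out[i_out] = v[i_in]
    end
    return out
end

r1 = repeat1d_sketch([10, 20, 30], 2, 2)
```

Here `R` has size 4×6 and `r1` has length 12; the input elements appear as 10, 10, 20, 20, 30, 30 and that block is then repeated once more.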

4 comments on commit a559c56

@StefanKarpinski
Member

I suppose this means we should have div1 also, although perhaps only fld1 and mod1 really make sense?

@johnmyleswhite
Member Author

I think having the whole gamut will be useful since they make it a lot easier to deal with 1-based indexing.
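
A small sketch of the 1-based indexing point, in current Julia syntax. The `locate` helper and the page/slot scenario are made up for illustration; `mod1`, `fld1`, and `fldmod1` are the functions available in Julia's Base today.

```julia
# Hypothetical example: items numbered from 1, laid out on pages of fixed size.
# fld1 gives the 1-based page, mod1 the 1-based slot on that page.
locate(item::Integer, per_page::Integer) = (fld1(item, per_page), mod1(item, per_page))

per_page = 4
positions = [locate(item, per_page) for item in (1, 4, 5, 8, 9)]

# The pair can be turned back into the item number without any off-by-one fixes.
item_number(page::Integer, slot::Integer, per_page::Integer) = (page - 1) * per_page + slot
```

With plain `fld` and `mod` the same code would need `fld(item - 1, per_page) + 1` and `mod(item - 1, per_page) + 1`, which is exactly what the one-based variants wrap.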

@StefanKarpinski
Member

Right – for indexing, you generally want mod1 since it always puts you in the 1:n range even if the argument is negative. It's hard for me to think of a use case for rem1, which ends up in the 1:-1:2-n range if the argument is negative (or should it be -1:-1:n?). The only usage I can think of is as a performance tweak for cases where we don't think the argument will ever be negative since rem may be faster than mod, but I feel like we should handle that some other way so that people can write semantically correct code and still get good performance.
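
To make the ranges in this comment concrete, here is a minimal sketch in current Julia syntax. `rem1_sketch` is a hypothetical stand-in that copies the `rem1` definition from the diff (`rem(x-1,y)+1`); `mod1` is the Base function.

```julia
# Stand-in for the rem1 added in this commit; the name is hypothetical.
rem1_sketch(x::Integer, y::Integer) = rem(x - 1, y) + 1

n = 5
xs = -6:6
mod1_vals = [mod1(x, n) for x in xs]          # always lands in 1:n
rem1_vals = [rem1_sketch(x, n) for x in xs]   # can fall to 2-n for x <= 0

# Wrap-around indexing: the predecessor of the first element is the last one.
ring = ["a", "b", "c", "d", "e"]
prev_of_first = ring[mod1(1 - 1, n)]
```

For arguments of at least 1 the two functions agree, so they differ only in how non-positive arguments are handled.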

@johnmyleswhite
Member Author

I'm not that sold on rem1: I included it only for completeness since rem was being used in the previous implementation of the indexing operations I extended. I would agree that fld1 and mod1 are the only cases that really matter.
