RFC: Linear indexing with multidimensional index. #2247

GunnarFarneback · 2013-02-09T17:22:05Z

This change makes array indexing more general and since indexing is automatically controversial, let's start the discussion.

Today indexing a[b], when a is an array and b is an array with more than 1 dimension, gives a no method error for check_bounds. In contrast Matlab produces an array of the same shape as b, or effectively reshape(a[b[:]], size(b)). I've found this indexing quite handy, especially in situations like
img_lut = lut[1 + img]
to apply a lookup table transformation to an image or
I = [1 4 5; 4 2 6; 5 6 3]
A = a[I]
to produce matrices with certain structures.

For indexing with multiple indices, e.g. a[b, c], this patch effectively gives a[b[:], c[:]]. When b and c are row or column vectors this is very reasonable. When they are full matrices it's arguable but happens to be consistent with Matlab.

JeffBezanson · 2013-02-10T01:02:26Z

I'm not against this. Let's see what everybody thinks.
One thing is for sure: it is amazing that this only requires changing two lines.

ViralBShah · 2013-02-10T01:09:00Z

This is quite convenient, and I frequently used to do things like this when I used matlab. This also does not break existing behaviour.

We also need to add this to the documentation.

timholy · 2013-02-10T10:59:39Z

Agreed, this is useful. I struggle a bit with knowing what the behavior of A[b,c] should be in cases where A is 3-dimensional, b is two-dimensional, and c one-dimensional. Should the indices in b be measured against size(A,1) or size(A,1)*size(A,2)? I think the latter may be more useful, but I think it does not agree with Matlab's behavior and would be slightly harder to achieve.

ViralBShah · 2013-02-10T20:24:00Z

I have marked this for 0.2.

toivoh · 2013-02-13T21:04:50Z

I also think that this kind of indexing is very useful, and have found use for it on a number of occasions (especially the generalization from numpy, described below). But it's a very different beast from the the kind of indexing that we have now, lets call it axiswise indexing (in full-dimensional and linear flavors).

I found section 3.4 in the Guide to NumPy to be enlightening, in several ways:

Short python glossary:
sequence: anything that can be indexed like a 1d array
slice: a range in Julia

I interpret the different kinds of indexing described as follows:

Basic slicing is pretty much equivalent to full-dimensional indexing in Julia as it is right now (e.g. axiswise), restricted to indexes that are integers and ranges.
Advanced selection (the integer kind) is a generalization of what this pull request proposes. It is basically pointwise indexing, i.e. if I let X = pointref(A, inds...), then for any valid indexes ks... into X,
```
X[ks...] = A[[ind[ks...] for ind in inds]...]
```
if inds is an iterable of same-size arrays (X will also be this size). The index arrays are conceptually broadcasted to the same shape first; if they can't, it is an error.
Partial indexing is a mixture of the two above. Quoting the pdf:

If the index subspaces are right next to each other, then the broadcasted indexing space directly replaces all of the indexed subspaces in X. If the indexing subspaces are separated (by slice objects), then the broadcasted indexing space is first, followed by the sliced subspace of X.

Pretty complex.

My conclusion is that axiswise and pointwise indexing (basic slicing and advanced selection in numpy) are very different things. They do sometimes coincide, but it's probably impossible to come up with a consistent way to overload them in the same indexing behavior, especially with any hope of being able to explain how it works.

I would personally like to see this issue resolved by creating a new pointwise indexing operator .[] corresponding to pointref above. @GunnarFarneback's example above could then be written as

A=a.[l]

A nice thing to note is that that this pointwise indexing operator pretty much corresponds exactly to what is achievable in parallel in CUDA or OpenCL when indexing into an array in working memory. Also note that my suggestion assigns an entirely different meaning to A[B, C] than above: broadcast B and C to the same shape,
then map each corresponding pair of elements (b,c) into the element of the result a[b,c] at the same position.

Perhaps I should open a new issue to discuss this?

ViralBShah · 2013-03-17T12:44:46Z

I'd like to merge this, unless there is any objection. Needs a rebase so that the pull request can be merged.

timholy · 2013-03-17T13:24:45Z

I'd be supportive. We can define other functions that do other types of indexing later.

toivoh · 2013-03-17T13:30:41Z

I think that this abstraction is fundamentally at odds with the one that we currently have for indexing, and that trying to make the two coexist in the same indexing operator will inevitably lead to clashes and inconsistencies. Let me elaborate my suggestion for a separate pointwise indexing operator, I hope that that will shed some light on this issue.

toivoh · 2013-03-17T14:34:18Z

Ok, I thought this through some more (see #2591). I'm for this as long as we give an error for indexing with more than one index (when at least one index is multidimensional). That case I think we have to discuss some more. I assume that this patch makes

A[B] = (A[:])[B]

That seems like the reasonable thing to do with linear indexing, and is also consistent with #2591.

GunnarFarneback · 2013-03-19T06:59:40Z

Rebase done.

ViralBShah · 2013-03-21T03:49:12Z

@GunnarFarneback I think this only applies to indexing with one argument, looking at your patch. @toivoh Can you say if this patch is ok in its current form? If so, we can go ahead and merge it, and then tackle #2591 later.

toivoh · 2013-03-21T06:19:51Z

I tried this out, and I'm still missing an error message when I try to use multidimensional indexes with more than one index. Not sure how to best implement that; I might be able to look into it in the next few days if no one else does. This change seems to alter the foundations that all indexing methods rely on. I would be more comfortable with just changing the implementation of indexing methods that take one index.

ViralBShah · 2013-03-21T06:23:00Z

Good point. I agree that we should try to do this only in the one index case.

GunnarFarneback · 2013-03-21T23:28:48Z

Those innocent-looking lines in the patch have much more effect than one would think because indexing with both single and multiple multidimensional indices was already implemented, it just barfed on check_bounds being much more restrictive (and without the index_shape modification would have produced a linear output with one index).

Restricting to only allowing one multidimensional index is more intricate. It probably involves changing getindex(A::Array, I::Union(Real,AbstractArray)...) on array.c line 330 to AbstractVector, introducing a new getindex(A::Array, I::AbstractArray), and finding the right way to solve the ambiguity with getindex{T<:Real}(A::AbstractArray, I::AbstractVector{T}) on line 271. I don't quite understand why that getindex function works on an AbstractArray when all other getindex functions in array.c work on an Array.

toivoh mentioned this pull request Mar 17, 2013

Broadcasting pointwise indexing operator #2591

Open

Support a[b] indexing when b is multidimensional.

a8cbb44

JeffBezanson closed this in b8e41d1 Apr 3, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Linear indexing with multidimensional index. #2247

RFC: Linear indexing with multidimensional index. #2247

GunnarFarneback commented Feb 9, 2013

JeffBezanson commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

timholy commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

toivoh commented Feb 13, 2013

ViralBShah commented Mar 17, 2013

timholy commented Mar 17, 2013

toivoh commented Mar 17, 2013

toivoh commented Mar 17, 2013

GunnarFarneback commented Mar 19, 2013

ViralBShah commented Mar 21, 2013

toivoh commented Mar 21, 2013

ViralBShah commented Mar 21, 2013

GunnarFarneback commented Mar 21, 2013

RFC: Linear indexing with multidimensional index. #2247

RFC: Linear indexing with multidimensional index. #2247

Conversation

GunnarFarneback commented Feb 9, 2013

JeffBezanson commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

timholy commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

toivoh commented Feb 13, 2013

ViralBShah commented Mar 17, 2013

timholy commented Mar 17, 2013

toivoh commented Mar 17, 2013

toivoh commented Mar 17, 2013

GunnarFarneback commented Mar 19, 2013

ViralBShah commented Mar 21, 2013

toivoh commented Mar 21, 2013

ViralBShah commented Mar 21, 2013

GunnarFarneback commented Mar 21, 2013