What is the official `similar` interface? #11574

johnmyleswhite · 2015-06-04T15:05:50Z

While discussing #10525, it became clear that NullableArray{Int, 1} is an anomaly with regard to some of the assumptions made about AbstractArray{T, N}. Unlike Array{Int, 1}, which has an eltype of Int, NullableArray{Int, 1} has an eltype of Nullable{Int}. It is, to my knowledge, the only AbstractArray type whose first type parameter is not equal to its eltype. (This, of course, could be changed.)

This distinction affects how one interprets something like

x = NullableArray([1, 2, 3], [false, false, false])

similar(x, Float64, 3)

versus something like

x = NullableArray([1, 2, 3], [false, false, false])

similar(x, Nullable{Float64}, 3)

In the first variant, a call to similar(x, T, dims) produces a new object of typeof(x) <: AbstractArray{T, N} whose first type parameter is the passed argument T.

In the second variant, a call to similar(x, T, dims) produces a new object of typeof(x) <: AbstractArray{T, N} whose first type parameter is eltype(T).

More generally, this raises the issue: should the second argument to similar refer to the first type parameter of the resulting AbstractArray or should it refer to the eltype of the resulting AbstractArray? Or should we require all AbstractArray objects to match the type-parameterization assumptions that the first type is always the eltype and the second is always the order of the array?
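The gap between the first type parameter and the eltype can be reproduced in current Julia without NullableArray. The following is only a sketch: `MaybeVector` is a hypothetical stand-in (not the NullableArrays.jl type) that uses `Union{T,Missing}` where the original used `Nullable{T}`.

```julia
# Hypothetical type: its first type parameter is T, but the eltype it
# reports through AbstractVector is Union{T,Missing}.
struct MaybeVector{T} <: AbstractVector{Union{T,Missing}}
    values::Vector{T}
    isna::Vector{Bool}
end

Base.size(v::MaybeVector) = size(v.values)
Base.getindex(v::MaybeVector, i::Int) = v.isna[i] ? missing : v.values[i]

x = MaybeVector([1, 2, 3], [false, true, false])

first_param = typeof(x).parameters[1]   # the first type parameter of MaybeVector
reported    = eltype(x)                 # what generic AbstractArray code sees

# With no custom `similar` method, the generic fallback treats the second
# argument as the eltype and allocates a plain Array.
y = similar(x, Float64, 3)
```

Here `first_param` is `Int` while `reported` is `Union{Missing, Int}`, which is exactly the mismatch the question is about.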

tkelman · 2015-06-04T15:11:44Z

Worth thinking about whether #11557 ties into this too? It's seeming like using eltype in as many places as we do right now isn't as general as one might like.

mbauman · 2015-06-04T15:58:42Z

My stance on this is fairly firm: I think that what you choose to be the parameters of an AbstractArray subtype are implementation details specific to that type. They don't matter. You don't even need to parameterize an AbstractArray subtype (e.g., type BitArray{N} <: DenseArray{Bool,N}, or SparseMatrix{Tv,Ti} <: AbstractArray{Tv,2}). The fact that we typically do so in the same order as AbstractArray is just a code style convention.

What matters is what you report are the parameters to AbstractArray{T,N}; T must be its eltype, and N must be ndims. It's actually not possible to write methods that generically extract out the type parameters of an arbitrary leaf type, and that's a good thing (yes, I see the irony here with respect to #11547 😄).

So, for similar, I think that it'd be very confusing to allow specifying anything other than the AbstractArray's eltype because it's part of the generic AbstractArray interface. Going back to BitArray, similar(falses(10), Float64) is simply Array(Float64, 10) (even before #10525). I think that similar simply means "give me a chunk of mutable storage of a certain size that can support this element type."
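The BitArray case can be checked in current Julia, where `Array(Float64, 10)` is spelled `Array{Float64}(undef, 10)`. This is a small sketch of the behaviour described above, not a statement about how it was implemented at the time of the discussion.

```julia
B = falses(10)                # BitVector: its only type parameter is N

same_kind = similar(B)            # no eltype requested: stays a BitVector
as_float  = similar(B, Float64)   # BitArray cannot hold Float64: plain Array
as_bool   = similar(B, Bool)      # Bool fits, so a BitVector again

println(typeof(same_kind), " ", typeof(as_float), " ", typeof(as_bool))
```

The second argument is read as the requested eltype, and the container type changes whenever the original container cannot store that eltype.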

StefanKarpinski · 2015-06-04T20:01:13Z

I pretty much agree with everything @mbauman says. It's a little strange that

similar(NullableArray([1], [false]), Float64)

produces an Array{Float64,1}, but it seems right.

mbauman · 2015-06-04T20:40:31Z

Diagonal is an interesting case for similar in terms of its mutability (once setindex! is defined for its diagonal elements): #10001 (comment)

Is returning a Diagonal from similar mutable enough? Or does the entire array need to be mutable to satisfy the requirements of similar? What if the dimensions aren't square? Should it return a full array in that case? If so, then it'd be type unstable to return a Diagonal in the square case.

johnmyleswhite · 2015-06-04T20:42:19Z

I mostly agree with what @mbauman is saying, but I think that specific behavior is really weird. In particular, I think it's odd that similar(NullableArray([1], [false]), Float64) produces an Array, but similar(NullableArray([1], [false]), Nullable{Float64}) produces a NullableArray. What should similar(NullableArray([1], [false]), Nullable{Nullable{Float64}}) produce?

Alternatively, why are we saying that Array{Nullable{Float64}, 1} should never be in the image of similar(NullableArray, ...)? It seems strange to me that you could ever get an array back, but certain arrays could never be produced.

tkelman · 2015-06-04T20:49:47Z

> Is returning a Diagonal from similar mutable enough?

I don't see how the thinking here could be any different than for SparseMatrixCSC, where similar returning anything other than another SparseMatrixCSC is just asking for an OutOfMemoryError.

> What if the dimensions aren't square?

It looks like we don't have non-square Diagonal matrices right now: the size is not explicitly stored as a field, so a Diagonal is assumed to always be square:

julia/base/linalg/diagonal.jl

Line 25 in ce90a6d

size(D::Diagonal) = (length(D.diag),length(D.diag))

#8240 would provide an equivalent though.

mbauman · 2015-06-04T21:12:08Z

I think similar is just spelled wrong for the way we've been using it. If I remember right, I think its name led to a lot of my initial confusion on its purpose. Perhaps a better spelling would be alloc, or maybe even AbstractArray. I see it as a way to ask for an AbstractArray in such a way that an existing array can say something about what kind of array is returned. So @johnmyleswhite: similar(NullableArray([1], [false]), Nullable{Nullable{Float64}}) should yield a NullableArray{Nullable{Float64}}.

nalimilan · 2015-06-04T21:19:53Z

Indeed, it helps to think about NullableArray{T} as being an efficient but fully compatible replacement for Array{Nullable{T}}.

StefanKarpinski · 2015-06-04T21:30:58Z

This is a good point: when similar was first introduced, we could not call abstract types. Now it might make more sense for these to be methods of abstract types like AbstractArray.

tkelman · 2015-06-04T21:33:10Z

What does call on an abstract type even mean? I guess the answer is "whatever we want it to," but that doesn't instill much confidence.

StefanKarpinski · 2015-06-04T21:37:22Z

"Give me a reasonable value of this type". For example, you could write Integer(x) to get a reasonable kind of Integer for x – e.g. if x is Float64 then an Int, if x is BigFloat then a BigInt. Admittedly this is a little different since you're not converting a single value. I'm not saying it's what we should do, but it's worth considering as something that we now have the ability to do.

mbauman · 2015-06-04T21:42:53Z

I was just spit-balling, but I'm liking it more and more. It certainly solves @johnmyleswhite's initial question: it'd be more obvious that AbstractArray(A, T, dims) gives you some subtype of AbstractArray{T}.

tkelman · 2015-06-04T21:46:45Z

I'm liking it less and less. It's really "give me a reasonable value of a reasonably chosen concrete subtype of this abstract type" which is a bit too ambiguous for my liking. Especially if there start being odd mapping rules from the concrete type of A (combined with T in some interdependent way) to the concrete type of the output. If that mapping is always the identity then at least it's consistent and maybe not too bad, but anything else would feel fishy.

StefanKarpinski · 2015-06-04T21:50:45Z

+1 about the identity mapping of element types (unless, of course, a different element type is asked for).

tkelman · 2015-06-04T21:57:10Z

AbstractArray(A::AbstractArray) can have that property (and I actually meant return container type), but can AbstractArray(A::AbstractArray, T, dims)?

mbauman · 2015-06-04T21:59:53Z

Definitely not if it's going to wholesale replace similar.

Once SparseVector is introduced, we'll want:

similar(::SparseMatrix, T, ::Int) -> SparseVector{T}
similar(::SparseMatrix, T, ::Int, ::Int) -> SparseMatrix{T}
similar(::SparseMatrix, T, ::Int, ::Int, ::Int) -> SparseCOO{T} # Maybe someday

This is essential for sensible indexing behavior.
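For comparison, this is how the dimension-dependent dispatch looks with the SparseArrays standard library in current Julia. It is a sketch that assumes a Julia 1.x installation where `SparseArrays` is available; the concrete type names are those of the present-day library, not the ones proposed above.

```julia
using SparseArrays

S = sprand(4, 4, 0.5)             # a SparseMatrixCSC{Float64,Int}

v = similar(S, Float32, 4)        # one dimension requested
M = similar(S, Float32, 4, 2)     # two dimensions requested

println(typeof(v))
println(typeof(M))
```

The container returned depends on the number of dimensions asked for, while the eltype is whatever was passed as the second argument.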

mbauman · 2015-06-05T03:13:04Z

Another thing to consider here: it'd also be nice to allow some sort of promotion to a common AbstractArray type when working with two or more arrays at the same time: #2326.

tomasaschan · 2015-06-05T07:29:59Z

I think a few basic assumptions, valid for Arrays, make this a lot easier, even for trickier cases:

- similar(A) does the same as copy(A) without the actual copy of data (instead returning uninitialized memory)
- adding arguments to similar only changes that specific aspect of the output, i.e. if the user didn't supply an argument about a specific aspect, that aspect should not change.

In practice, I guess especially the first point means that if copy(A) doesn't yield an Array, then similar needs to be extended with new methods to handle this. Exactly how T relates to eltype, etc, should IMHO depend on what is expected of a call to copy.
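For plain Arrays, the two rules above can be checked directly. This is a minimal sketch in current Julia that only exercises `Array`; it makes no claim about other AbstractArray subtypes.

```julia
A = zeros(Int, 2, 3)

B = similar(A)               # same type and size as copy(A), contents uninitialized
C = similar(A, Float32)      # only the element type changes
D = similar(A, (4,))         # only the shape changes
E = similar(A, Float32, 4)   # both aspects are overridden

for X in (B, C, D, E)
    println(typeof(X), " ", size(X))
end
```

Each extra argument changes exactly one aspect of the result, and anything not mentioned is inherited from `A`.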

mbauman · 2015-06-05T12:44:21Z

I like those rules, @tlycken, but I think I'd slightly amend each. I think similar(A) needs to return a different thing than copy(A) if A ~~is immutable~~ doesn't support setindex!. And I do think that there needs to be room for interaction between the container type and the element type and/or dimensionality.

Here's another idea: break similar into two parts. similar(A, Float64, (10, 10)) becomes:

T = promote_type(typeof(A), AbstractArray{Float64, 2})
T(Float64, 10, 10)

Then we make the constructor (both T(::Type, ::Dims) and T(::Type, ::Int...)) a part of the required AbstractArray interface. This is now very explicit. Generic code that simply wants some sort of AbstractArray back can do so, and code specific to a type can make sure that T remains that type.

mbauman · 2015-06-05T12:46:06Z

Hm, upon looking back at that, it doesn't quite do what we want. similar typically is used to say what the output eltype must be, whereas the above would promote all the passed eltypes.

timholy · 2015-06-05T14:14:32Z

> I think similar(A) needs to return a different thing than copy(A) if A is immutable.

I'm probably missing your main point, but just because the container is immutable doesn't mean the elements are. Example:

immutable MyArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
A = MyArray(rand(2,2));
A.data[1,1] = 0.5

works just fine.
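In current Julia the `immutable` keyword has become `struct`, and an AbstractArray subtype also needs `size` and `getindex` before it can be displayed or indexed. A rough modern equivalent, assuming Julia 1.5 or later for `ismutable`, might look like this:

```julia
# An immutable wrapper: the field binding cannot be reassigned, but the
# wrapped Array is still mutable.
struct MyArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end

Base.size(A::MyArray) = size(A.data)
Base.getindex(A::MyArray{T,N}, I::Vararg{Int,N}) where {T,N} = A.data[I...]
Base.setindex!(A::MyArray{T,N}, v, I::Vararg{Int,N}) where {T,N} = (A.data[I...] = v)

A = MyArray(zeros(2, 2))
A.data[1, 1] = 0.5     # mutate through the field, as in the example above
A[2, 2] = 1.5          # or through setindex!, once it is defined

container_is_mutable = ismutable(A)   # false: the container itself is immutable
```

Whether the container type is immutable and whether it supports `setindex!` are therefore separate questions.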

Also, I'm skeptical about the promote idea, because you'll always get a non-leaf type unless people write specific promote_type methods (and then there's the ambiguity problem again...).

mbauman · 2015-06-05T14:35:01Z

Yeah, I don't think promotion is quite the operation we want here. And see my edit — I was lazy and wrote immutable when I meant not-supporting-setindex!.

I'm still entertaining the idea that similar could be split up into two steps, though: some step to identify the type, and another to do the construction.

timholy · 2015-06-05T14:37:44Z

If a type doesn't support setindex!, then I think it's reasonable that it should provide a custom override of similar.

tomasaschan · 2015-06-05T14:42:02Z

Also, for many AbstractArray subtypes, it's simply not possible to define a constructor T(::Type, ::Int...). Again, looking at Interpolations.jl, this would make subtyping AbstractArray counterproductive.

My point for the first bullet was that similar(A) and A should be, well, similar, in all aspects. Type, eltype, dimensions, memory layout, whatever other characteristics the concrete type of A happens to have, should all be the same for similar(A). The only difference between similar(A), copy(A) and zeros(A) should be what contents they have.

My point with the second bullet was that when the user supplies additional arguments, it is because they want to change some specific aspect of the result. Exactly which aspect that is, and what the parameters for that should be, might vary between different concrete types. similar(::Array, T) changes both the first type parameter and the eltype of the result to T, while similar(::NullableArray, T) changes the eltype to Nullable{T}. This is perfectly fine.

While writing this, I've come to realize that I don't think there even is a sane way to define what similar(::AbstractArray, args...) is supposed to do without knowing the concrete type of the array, at least not for a very general case that allows package authors and other AbstractArray-subtypers to not define a method or two for similar.

mbauman · 2015-06-05T14:44:12Z

> should be, well, similar, in all aspects

That's precisely why I think this is a naming bug. The way similar is currently used does not mean similar. :)

mbauman · 2015-06-05T14:50:11Z

Anyone have a nice name for "ask an array (or ideally arrays) for some modifiable AbstractArray that it knows can support a given element type and dimensionality"?

timholy · 2015-06-05T17:12:48Z

To me, similar doesn't imply "the same," so I don't think it's that bad. congruent might be an alternative?

johnmyleswhite · 2015-09-08T17:40:24Z

I'm ok with closing this, since I think my remaining concerns would require a very substantial reorganization of our abstractions to resolve.

tkelman · 2015-09-08T17:43:00Z

Defer to #10064 for more general abstraction design?

mbauman · 2015-09-08T17:45:43Z

👍

mbauman mentioned this issue Jun 4, 2015

SubArray getindex doesn't work on immutable array types #10001

Closed

tkelman mentioned this issue Jun 8, 2015

Interfaces for immutable arrays #11610

Closed

johnmyleswhite mentioned this issue Aug 24, 2015

[WIP] Clean up codebase for a public 0.1.0 release JuliaStats/NullableArrays.jl#56

Open

tkelman mentioned this issue Sep 8, 2015

Fix sharedarray indexing regression #12964

Merged

mbauman mentioned this issue Sep 8, 2015

Improve similar documentation #13018

Merged

tkelman closed this as completed Sep 8, 2015

This was referenced Oct 21, 2015

"Triangular matrix must be square" error for square triangular matrices #13174

Closed

How best to fix similar for triangular matrices? #13731

Closed

nalimilan mentioned this issue Mar 5, 2016

map(f, c) return type #15342

Closed

Sacha0 mentioned this issue Jan 1, 2019

similar(::SparseMatrixCSC, dims) returns sparse matrix with empty space (#26560) #30435

Closed
