Add `squeeze(A)` method to squeeze all singleton dimensions #22000

staticfloat · 2017-05-21T03:47:07Z

This addresses #11838, which gave me hope that this might be accepted. I'm not sure I agree with the "the user may accidentally shoot themselves in the foot" argument; it seems to me that this function does something useful (that at least a few people want) and the behavior is very straightforward, and so isn't something to cause much surprise.

The only surprising thing about this implementation is that passing in an array with nothing but singleton dimensions returns a zero-dimensional array:

julia> z = randn(1,1)
1×1 Array{Float64,2}:
 -0.454977

julia> squeeze(z)
0-dimensional Array{Float64,0}:
-0.454977

Of course, this is what happens with the old squeeze as well:

julia> squeeze(z, (1,2))
0-dimensional Array{Float64,0}:
-0.454977

nalimilan · 2017-05-21T08:20:56Z

base/abstractarraymath.jl

+```
+"""
+function squeeze(A::AbstractArray)
+    singleton_dims = [d for d in 1:ndims(A) if size(A, d) == 1]


Maybe using a generator would be more efficient here?

Although it doesn't effect performance that much (we need to collect the generator output before passing it to squeeze(A, dims) in all cases) it is useless to create a temporary array, so I've done away with that, thanks.

Sacha0 · 2017-05-21T17:04:27Z

base/abstractarraymath.jl

+ 1  3
+ 2  4
+```
+"""


Perhaps add a warning re. the potential pitfalls of this method?

I'm not sure what I would write? This function is incredibly straightforward; it removes all singleton dimensions.

Looking back over the previous discussion, it looks like the concern is that you may pass in an NxMxL matrix expecting L to equal 1, but then being surprised when M also happened to equal 1 and thus the result is a 1-dimensional array. I'm not sure how to make the docstring more explicit that such a thing will happen; that is, after all, the entire point of this function.

StefanKarpinski · 2017-05-22T02:08:44Z

I still don't think we should have this – it is inherently type unstable and will be an inevitable cause of buggy corner cases. Instead, I've previously proposed squeeze(r, A, dims...) which would do squeeze(r(A, dims...), dims...) for you, where r is a reducer like sum.

staticfloat · 2017-05-22T06:42:44Z

The type instability is kind of the point; I have code that takes in a variable that on 0.5 ends up being a Vector, and on 0.6 ends up being an Nx1 Matrix. The operations I perform upon this variable were designed with a Vector in mind, and in a perfect world would work flawlessly upon both, but I don't want to have to go through all the library code that is explicitly expecting a Vector (often enforced with type annotations) and make sure it "does the right thing" when passed a Matrix that just so happens to look like a Vector. I want a function that takes in a Matrix, and if it can be made into a Vector, it makes it into a Vector. Yes this is inherently type-unstable, but as mentioned earlier, that's the whole point.

Half of the whole reason I'm proposing this is because the squeeze() function throws an error if you try to squeeze a dimension that doesn't exist. If it just returned the original vector when I tried to squeeze the nonexistant second dimension then I would be happy; but such a thing does not exist, so I am forced into code like this. Note that this is kind of the opposite of type-unstable; I am enforcing a common type upon varying inputs.

Personally, most of the reason I want this kind of a function is to enforce type stability, rather than create type instability. I want to accept Vectors, 1xN matrices, Nx1 matrices, etc.... and convert all of them to a Vector. Or I want to accept Matrices, NxMx1 tensors, etc..... and convert all of them to a Matrix. In a sense, I care less about what dimensions I'm starting with, and more about what dimensions I end with, so maybe we should attack this the other way around and come up with a smart way of squeezing with a deterministic return type, but in such a way that doesn't require me to perfectly determine the input dimensions so that I don't have to special-case on whether its a column vector or a row-vector, etc...

andyferris · 2017-05-22T09:54:55Z

This is exactly what vec is for. The output dimensionality is always 1 but it accepts any shape.

andyferris · 2017-05-22T10:00:56Z

(I'm beginning to labour this point - but since frequently Nx1 or 1xN matrices or row vectors are desired shapes in similar situations, functions for these in anology to vec might also be desirable... in general I feel functions which explicitly specify the output shape would be better than a generic squeeze whose behaviour can depend on dynamic size, leading to corner cases exposed at run time (and by a lack of inferability, of course))

timholy · 2017-05-22T10:37:17Z

@staticfloat, just to clarify: it's not possible to have squeeze(A) return a type that the compiler can infer---the sizes of the input matrix are a runtime variable, so there's no way for the compiler to know how many dimensions the output will have. So it can't possibly give you the type stability you seek.

In contrast, vec will give you a type-stable output, and is far clearer in any case.

martinholters · 2017-05-22T10:59:52Z

Maybe a squeeze-like function where dims denotes the dimensions to keep would be useful. E.g.

nn(A, (1,3)) == A[:,1,:] # or error if size(A,n) ≠ 1 for any n ∉ {1, 3}

By making duplicate dimensions in dims an error, that would be type-stable as the result would always have length(dims) dimensions. (We probably have this and I just missed it, right?)

staticfloat · 2017-05-22T18:03:32Z

This is exactly what vec is for. The output dimensionality is always 1 but it accepts any shape.

This is close, and I will tend to use it, but vec() will happily take in a 5x5 Matrix and turn it into a 25-length Vector. That's not really what I'm going for.

Perhaps what I really want is the inverse of #21977, so that I can just give a higher-order tensor to a Vector() constructor, and it will squash things down deterministically, and error out if the operation doesn't make sense, as in the 5x5 Matrix example.

mbauman · 2017-05-22T18:20:55Z

Hm, then maybe a decent spelling is something like squeeze(Vector, A) or squeeze(Matrix, A). Greedily remove singleton dimensions until you get to the specified number of dimensions (in the Array{<:Any}'s type). Type-stable and meaningful.

staticfloat · 2017-05-22T18:24:09Z

Yeah, I like that a lot. Let's define it to remove dimensions from right to left, so e.g. squeeze(Matrix, ones(1, 5, 1)) would yield a 1x5 Matrix.

nalimilan · 2017-05-22T19:39:21Z

It could also be useful for that function to add singleton dimensions when needed: squeeze(Matrix, x) would convert a vector to a one-column matrix. If we want that, then another name than squeeze should be found.

StefanKarpinski · 2017-05-22T19:55:01Z

convert?

nalimilan · 2017-05-22T19:59:54Z

convert would work, but currently it doesn't behave that way (and at #16633 people didn't seem OK with changing it).

andyferris · 2017-05-22T19:59:54Z

convert?

Having a wider range of conversions or constructors that follow this behavior certainly would solve this problem. Though I'm not sure specifying the type is exactly what you want - really you want something that hooks into the similar interface and has certain rules on how to get the targetted dimensionality from the input's size (or even better, indices).

mbauman · 2017-05-22T20:40:35Z

It's one thing to convert any and all iterables. It's another thing to convert something that's already an AbstractArray. In fact convert(AbstractMatrix, A) could use similar to allocate the output. I think this is well-defined enough to be a method of convert.

andyferris · 2017-05-22T23:18:56Z

Right - I agree, a method for convert(::Type{AbstractMatrix}, ::AbstractArray) would be great. The thing which has concerned me in the last few days (in a few issues/PRs open at the moment) is that I'm betting people will much more frequently write Matrix(v) rather than AbstractMatrix(v), making their code less generic than if we just made a matrix(v) function (which can return any AbstractMatrix) and made this the official interface for such conversions, for instance...

ararslan · 2017-05-25T22:03:33Z

For the sake of bikeshedding, one possible name for the inverse of squeeze(T, x) would be inflate(T, x). You squeeze out singleton dimensions, and you inflate by adding more singleton dimensions.

StefanKarpinski · 2017-05-26T16:05:34Z

inflate is a good name; unsqueeze might be more obvious.

timholy · 2017-05-27T11:13:11Z

Another option for all this might be specialized index types:

A[!, :, :] would be shorthand for @assert size(A, 1) == 1; reshape(A, (size(A, 2), size(A, 3))). This is inferrable.
A[-,:,:] would drop the first dimension if it can (not type stable, and A[-,-,-] would be like squeeze(A), perhaps could be abbreviated A[-])
b[+, :, +] would create a 3d array from a vector b.

This is completely doable since we can dispatch on function types.

ararslan · 2017-05-27T21:02:13Z

That's a really neat idea but it seems kind of obscure to me.

timholy · 2017-05-28T00:29:30Z

The main advantage is that it tells you which dimensions to "inflate." Say you want to convert a matrix into a 5-tensor, how do you specify which dimensions should be singletons?

andyferris · 2017-05-28T04:11:18Z

APL indexing a dimension with column/row matrices also inserts a singleton dimension. Maybe we can utilise this somehow?

mbauman · 2017-05-30T15:56:29Z

b[+, :, +] would create a 3d array from a vector b.

This is what numpy's newaxis() does, and we have a first-class equivalent for this one: [CartesianIndex()].

Interestingly, when I saw you passing functions to indexing as an email preview — without any other context — my first thought was that you might have been proposing a new idiom for specifying reductions. See #16606 for that digression.

This simple definition makes it easier to write reductions that drops the dimensions over which they reduce. Fixes #16606, addresses part of the root issue in #22000.

staticfloat · 2017-10-21T16:11:23Z

I'm going to close this in favor of #23500

This simple definition makes it easier to write reductions that drops the dimensions over which they reduce. Fixes JuliaLang#16606, addresses part of the root issue in JuliaLang#22000.

tkelman mentioned this pull request May 21, 2017

Don't extend base functions on base types FluxML/Flux.jl#24

Merged

nalimilan reviewed May 21, 2017

View reviewed changes

Sacha0 reviewed May 21, 2017

View reviewed changes

ararslan added the domain:arrays [a, r, r, a, y, s] label May 21, 2017

Add squeeze(A) method to squeeze all singleton dimensions

60f1c0c

staticfloat force-pushed the sf/squeeze branch from b91d0eb to 60f1c0c Compare May 21, 2017 22:48

mbauman mentioned this pull request May 30, 2017

array reductions (sum, mean, etc.) and dropping dimensions #16606

Open

mbauman mentioned this pull request Aug 29, 2017

Add squeeze(f, A, dims) for reductions to drop dims #23500

Closed

staticfloat closed this Oct 21, 2017

ararslan deleted the sf/squeeze branch October 21, 2017 20:32

timholy mentioned this pull request Oct 6, 2018

have dropdims default to dropping all singleton dimensions #29547

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `squeeze(A)` method to squeeze all singleton dimensions #22000

Add `squeeze(A)` method to squeeze all singleton dimensions #22000

staticfloat commented May 21, 2017 •

edited

Loading

nalimilan May 21, 2017

staticfloat May 21, 2017

Sacha0 May 21, 2017

staticfloat May 21, 2017

StefanKarpinski commented May 22, 2017

staticfloat commented May 22, 2017

andyferris commented May 22, 2017

andyferris commented May 22, 2017

timholy commented May 22, 2017

martinholters commented May 22, 2017

staticfloat commented May 22, 2017 •

edited

Loading

mbauman commented May 22, 2017 •

edited

Loading

staticfloat commented May 22, 2017

nalimilan commented May 22, 2017

StefanKarpinski commented May 22, 2017

nalimilan commented May 22, 2017

andyferris commented May 22, 2017

mbauman commented May 22, 2017

andyferris commented May 22, 2017 •

edited

Loading

ararslan commented May 25, 2017

StefanKarpinski commented May 26, 2017

timholy commented May 27, 2017 •

edited

Loading

ararslan commented May 27, 2017

timholy commented May 28, 2017

andyferris commented May 28, 2017

mbauman commented May 30, 2017

staticfloat commented Oct 21, 2017

+3
+4
+              ```
+              """

Add squeeze(A) method to squeeze all singleton dimensions #22000

Add squeeze(A) method to squeeze all singleton dimensions #22000

Conversation

staticfloat commented May 21, 2017 • edited Loading

nalimilan May 21, 2017

Choose a reason for hiding this comment

staticfloat May 21, 2017

Choose a reason for hiding this comment

Sacha0 May 21, 2017

Choose a reason for hiding this comment

staticfloat May 21, 2017

Choose a reason for hiding this comment

StefanKarpinski commented May 22, 2017

staticfloat commented May 22, 2017

andyferris commented May 22, 2017

andyferris commented May 22, 2017

timholy commented May 22, 2017

martinholters commented May 22, 2017

staticfloat commented May 22, 2017 • edited Loading

mbauman commented May 22, 2017 • edited Loading

staticfloat commented May 22, 2017

nalimilan commented May 22, 2017

StefanKarpinski commented May 22, 2017

nalimilan commented May 22, 2017

andyferris commented May 22, 2017

mbauman commented May 22, 2017

andyferris commented May 22, 2017 • edited Loading

ararslan commented May 25, 2017

StefanKarpinski commented May 26, 2017

timholy commented May 27, 2017 • edited Loading

ararslan commented May 27, 2017

timholy commented May 28, 2017

andyferris commented May 28, 2017

mbauman commented May 30, 2017

staticfloat commented Oct 21, 2017

Add `squeeze(A)` method to squeeze all singleton dimensions #22000

Add `squeeze(A)` method to squeeze all singleton dimensions #22000

staticfloat commented May 21, 2017 •

edited

Loading

staticfloat commented May 22, 2017 •

edited

Loading

mbauman commented May 22, 2017 •

edited

Loading

andyferris commented May 22, 2017 •

edited

Loading

timholy commented May 27, 2017 •

edited

Loading