Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What should the shape of zipped shaped iterators be? #47195

Open
jakobnissen opened this issue Oct 17, 2022 · 3 comments
Open

What should the shape of zipped shaped iterators be? #47195

jakobnissen opened this issue Oct 17, 2022 · 3 comments
Labels
iteration Involves iteration or the iteration protocol

Comments

@jakobnissen
Copy link
Contributor

jakobnissen commented Oct 17, 2022

Currently, a zip of HasShape{N} iterators is itself HasShape{N}:

julia> z = zip([1 2], [1 2]');

julia> Base.IteratorSize(z)
Base.HasShape{2}()

However, the documentation of IteratorSize states that:

HasShape{N}() if there is a known length plus a notion of multidimensional shape
(as for an array). In this case N should give the number of dimensions, and the
axes function is valid for the iterator

Despite z being HasShape, axes is not valid, contrary to the docstring.

julia> axes(z)
ERROR: DimensionMismatch: dimensions must match: a has dims (Base.OneTo(1), Base.OneTo(2)), b has dims (Base.OneTo(2), Base.OneTo(1)), mismatch at 1

This error in the implementation means that some valid zips, like the above cannot be collected, because while it's a valid iterator, its shape is nonsensical:

julia> collect(z)
ERROR: DimensionMismatch: dimensions must match: a has dims (Base.OneTo(1), Base.OneTo(2)), b has dims (Base.OneTo(2), Base.OneTo(1)), mismatch at 1

I can't see why this shouldn't work. And if it should, I don't see a way a user could make this work, either. At this time, Base uses zip heavily internally, for example in the implementations of map and Generator, which are documented to preserve shape.

I'm not sure what is the right way to go about this, but I propose:

  • zip should not preserve shape, but simply be either SizeUnknown, IsInfinite or HasLength.
  • This implies shape-preserving operations such as map should not rely on zip.
@N5N3 N5N3 added fold sum, maximum, reduce, foldl, etc. iteration Involves iteration or the iteration protocol and removed fold sum, maximum, reduce, foldl, etc. labels Oct 17, 2022
@mcabbott
Copy link
Contributor

mcabbott commented Oct 18, 2022

Xref #42216... edit: mostly about map, but has links to older discussions about zip.

@jakobnissen
Copy link
Contributor Author

Let's leave the map discussion for another issue. Zip and map need not behave the same, so these issues are orthogonal.

@lxvm
Copy link
Contributor

lxvm commented Jan 16, 2024

I agree that zip should have a length when the shape is inconsistent, or simply be consistent with the shape of collecting zip.

julia> z = zip(1.0, 1:3)
zip(1.0, 1:3)

julia> collect(z)
1-element Vector{Tuple{Float64, Int64}}:
 (1.0, 1)

julia> size(z)
ERROR: DimensionMismatch: dimensions must match: a has dims (3,), must have singleton at dim 1
Stacktrace:
 [1] promote_shape
   @ Base ./indices.jl:162 [inlined]
 [2] promote_shape
   @ Base ./indices.jl:153 [inlined]
 [3] _promote_tuple_shape
   @ Base.Iterators ./iterators.jl:394 [inlined]
 [4] size(z::Base.Iterators.Zip{Tuple{Float64, UnitRange{Int64}}})
   @ Base.Iterators ./iterators.jl:390
 [5] top-level scope
   @ REPL[111]:1

When the shape and axes are consistent I could see why it is useful to to preserve the shape, although that doesn't appear to be enforced for 1d iterators.

julia> z = zip(ones(2,), ones(3,))
zip([1.0, 1.0], [1.0, 1.0, 1.0])

julia> collect(z)
2-element Vector{Tuple{Float64, Float64}}:
 (1.0, 1.0)
 (1.0, 1.0)

julia> z = zip(ones(2,), ones(3,3))
zip([1.0, 1.0], [1.0 1.0 1.0; 1.0 1.0 1.0; 1.0 1.0 1.0])

julia> collect(z)
2-element Vector{Tuple{Float64, Float64}}:
 (1.0, 1.0)
 (1.0, 1.0)

julia> z = zip(ones(2,2), ones(2,2))
zip([1.0 1.0; 1.0 1.0], [1.0 1.0; 1.0 1.0])

julia> collect(z)
2×2 Matrix{Tuple{Float64, Float64}}:
 (1.0, 1.0)  (1.0, 1.0)
 (1.0, 1.0)  (1.0, 1.0)

julia> z = zip(ones(2,2), ones(3,3))
zip([1.0 1.0; 1.0 1.0], [1.0 1.0 1.0; 1.0 1.0 1.0; 1.0 1.0 1.0])

julia> collect(z)
ERROR: DimensionMismatch: dimensions must match: a has dims (Base.OneTo(2), Base.OneTo(2)), b has dims (Base.OneTo(3), Base.OneTo(3)), mismatch at 1
Stacktrace:
 [1] promote_shape
   @ ./indices.jl:178 [inlined]
 [2] _promote_tuple_shape
   @ ./iterators.jl:394 [inlined]
 [3] axes(z::Base.Iterators.Zip{Tuple{Matrix{Float64}, Matrix{Float64}}})
   @ Base.Iterators ./iterators.jl:391
 [4] _similar_shape
   @ ./array.jl:711 [inlined]
 [5] _collect(cont::UnitRange{Int64}, itr::Base.Iterators.Zip{Tuple{…}}, ::Base.HasEltype, isz::Base.HasShape{2})
   @ Base ./array.jl:765
 [6] collect(itr::Base.Iterators.Zip{Tuple{Matrix{Float64}, Matrix{Float64}}})
   @ Base ./array.jl:759
 [7] top-level scope
   @ REPL[118]:1
Some type information was truncated. Use `show(err)` to see complete types.

Since multi-collection map uses zip, it seems unhelpful to break the existing behavior, but I still think it would be nice to have size(zip(...)) == size(collect(zip(...)))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iteration Involves iteration or the iteration protocol
Projects
None yet
Development

No branches or pull requests

4 participants