Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notation for lazy map #19198

Open
cossio opened this issue Nov 2, 2016 · 59 comments · May be fixed by #31553
Open

Notation for lazy map #19198

cossio opened this issue Nov 2, 2016 · 59 comments · May be fixed by #31553
Labels
broadcast Applying a function over a collection

Comments

@cossio
Copy link
Contributor

cossio commented Nov 2, 2016

The notation f.(v) effectively maps f over an array v. This operation is not lazy, it materializes the mapped array.

I think it could be useful to have an analogous notation for lazy maps. Maybe f..(v)? Instead of materializing the mapped array, this should return something equivalent to the generator:

(f(x) for x in v)

For example, this could be useful to do something like:

sum(f..(v))

which is more efficient than materializing the intermediate array in:

sum(f.(v))

Of course right now sum takes an optional function argument, so one can write sum(f,v). But see the discussion here: #19146. If one decides to remove the method sum(f,v), I think the notation sum(f..(v)) could be a nice alternative.

@Sacha0
Copy link
Member

Sacha0 commented Nov 2, 2016

Ref. discussion around #16285 (comment) (generalizing compact broadcast syntax to compact higher-order function syntax, relevant given the behavior mentioned above seems like mapreduce). Best!

@JeffBezanson
Copy link
Member

That's the problem with special syntax; there's never enough of it. I think the dots are pretty obscure in this case. What might be nice, and has been proposed before, is f(2x + y) over x, which could expand to f(2x_ + y) for x_ in x.

@stevengj
Copy link
Member

stevengj commented Nov 2, 2016

f(2x + y) for x in x works and is not much more verbose than f(2x + y) over x, so I don't really see a big need for over.

(One annoyance is the precedence: shouldn't for have lower precedence than =? Currently, g = f(2x + y) for x in x gives a syntax error.)

@stevengj
Copy link
Member

stevengj commented Nov 2, 2016

In #16285 we discussed making sum..(f.(x)) (or maybe sum:(f.(x))) equivalent to sum(f, x). The advantage of this is that it would be fusing, e.g. sum..(f.(g.(x)) would be equivalent to sum(x -> f(g(x)), x).

@cossio cossio changed the title Punctuation for lazy map Notation for lazy map Nov 2, 2016
@cossio
Copy link
Contributor Author

cossio commented Nov 2, 2016

I did not know about #16285.

Anyway, I think I prefer lazy map than materializing map. IMHO, I'd prefer that f.(v) returns the generator:

(f(x) for x in v)

If I wanted to materialize it I could collect(f.(v)).

On the other hand, I cannot "unmaterialize" f.(v) if it returns an array. Hence I think lazy f.(v) is more general and could be more useful. Additionally, "fusing" is automatic with lazy f.(v).

@stevengj
Copy link
Member

stevengj commented Nov 2, 2016

f.(v) is a broadcast operation, which is more general in other ways than map or generators. (The functionality of map and broadcast intersect, but neither subsumes the other.)

It is an interesting question whether one could make a lazy broadcast that is as efficient as a fusing materializing broadcast. That is, suppose f.(x,y) returned an AbstractArray subtype that lazily computed its elements. I'm guessing that the answer in practice is "no", unfortunately.

For example, suppose that you cascade a sequence of these operations: z .= abs.(x.^2 .+ sin.(sqrt.(y .* 2))). Our fusing materializing broadcast can completely erase the overhead of the composition, and gives performance essentially identical to a handwritten loop overwriting z in-place. Instead, imagine that each of these operations created a lazy AbstractArray, so that indexing the right-hand side of the assignment cascaded through a sequence of lazy getindex calls. Although no large temporary arrays would be allocated, I think we could have a hard time convincing the compiler to inline and elide all of those getindex calls to eliminate the overhead.

@JeffBezanson
Copy link
Member

JeffBezanson commented Nov 2, 2016

Yep --- with laziness, performance variance tends to go up.

@cossio
Copy link
Contributor Author

cossio commented Nov 2, 2016

Pardon my ignorance, but is it possible to "fuse" the generators (as is done in the materializing broadcast)? That is, is it possible to fuse

abs.(sin.(v))

into

(abs(sin(x)) for x in v)

instad of

(abs(x) for x in (sin(y) for y in v))

Can this be done automatically by the compiler? Then the efficiency issues of the lazy f.(v) would go away, right?

@JeffBezanson
Copy link
Member

Of course; if broadcast were lazy the exact same fusing would still happen. I believe the efficiency concerns are, for example, recomputing when the same element is accessed twice. With the lazy version some code would get faster but some code could get slower.

@cossio
Copy link
Contributor Author

cossio commented Nov 2, 2016

My point is that with lazy f.(v) you can materialize if you want (collect(f.(v)), which is still pretty concise), but if f.(v) materializes, you are forced to use generators to get the lazy version.

It's a matter of taste then:

Either you prefer short syntax for materializing broadcast, and use long syntax for lazy map (generators) (This is the way it is now).

Or you prefer slightly larger syntax for materializing broadcast (collect(f.(v))) and short syntax for lazy broadcast (f.(v)).

@andyferris
Copy link
Member

andyferris commented Nov 4, 2016

I had been playing with the idea of explicit (higher-order?) function composition (#17184, https://github.com/FugroRoames/CoordinateTransformations.jl) and I find that creating compositions and then (possibly later) applying them to functions is an incredibly powerful way of doing "lazy" evaluation. The idea is that f ∘ g makes a new function (or function-like object) such that (f∘g)(x) = f(g(x)). It would be straightforward to create a similar operator, lets call it .∘ (probably bad syntax), which chains together broadcasted functions in a similar way ((f.∘g)(x) = f.(g.(x)))

This leads to a few interesting ways of doing things - you can (a) construct the entire chain of functions until the point you want the "full" output relatively easily and then call that, or (b) wrap these up in some kind of "lazy" broadcasting composition (maybe with another operator), or (c) perform static (or dynamic) analysis of the function chain to optimize it, in the same way e.g. SQL will optimize a query. I had started playing with this last idea but haven't had any time to commit to it - but in principle it would be a bit like Query.jl but without any macros.

I suspect that (b) is very close to what you want - and the great thing is that you don't need any further support from the parser, the compiler or Base to achieve it. Just chose one of the many available operators and have fun!

@stevengj
Copy link
Member

stevengj commented Apr 26, 2018

Now that we internally have a lazy broadcast object via #26891, it might be nice to have a syntax so that fused dot calls like x .+ f.(x) lower to the lazy wrapper without the materialize(...) call. cc @mbauman.

Available (non-breaking) syntaxes seem to include .(x .+ f.(x)), (for x .+ f.(x)), @lazy x .+ f.(x).

One could also make a macro @lazy or @lazydot that is analogous to @views in that it would make all dot calls in a block of code lazy.

In #16285, the syntaxes sum:(x .+ f.(x)) and sum..(x .+ f.(x)) were also suggested. Since : and .. are already infix operators, however, this could in principle be breaking.

@tkoolen
Copy link
Contributor

tkoolen commented May 3, 2018

I would also love something like run(bc::Broadcastable) (perhaps with special syntax), which acts like a 'broadcasted foreach' for functions with side effects. In contrast to materialize(bc::Broadcasted), which allocates and populates an output array, it would simply call the broadcasted function with the appropriate arguments while discarding the return values.

@yurivish
Copy link
Contributor

yurivish commented May 4, 2018

Perhaps semicolon syntax could be used as a sign that the expression is to be run for side effects alone: x .+ f.(x);

[edit: on second thought, I don't think this would be as simple as it seemed, since you could imagine saying a = x.+ f.(x); which would want to materialize]

@tkoolen
Copy link
Contributor

tkoolen commented May 4, 2018

A macro could be a simple solution, e.g. @discard f.(x, y, z).

@stevengj
Copy link
Member

stevengj commented May 4, 2018

@tkoolen, that seems to be subsumed by a non-materializing broadcast syntax, since then you could just use foreach

For example, if we adopt the (for ...) syntax for non-materializing broadcast, you could do foreach(for g.(f.(x))).

@mbauman
Copy link
Member

mbauman commented May 4, 2018

foreach(for g.(f.(x)))

That'd be a bit of a pun since Broadcasteds aren't callable… and if they were, I'd expect foreach(::Broadcasted) to be executed exactly once (like foreach(println)). We could probably look past that, but the technically correct version would be something like:

bc = (for g.(f.(x))) # or whatever
foreach(i->bc[i], CartesianIndices(axes(bc)))

@stevengj
Copy link
Member

stevengj commented May 4, 2018

@mbauman, my thinking is that the unmaterialized Broadcasted objects should be iterable so that you can pass them to anything expecting the iteration protocol.

(Right now, we don't have a single-argument foreach function, but it makes sense to me to have foreach(itr) be equivalent to foreach(identity, itr).)

@mbauman
Copy link
Member

mbauman commented May 4, 2018

Ah, yes of course. I'm blinded by sitting too close to the code. Broadcasteds aren't (yet) iterables, but let's make them so.

That said, we do have a zero-argument foreach:

julia> foreach(()->println("hello"))
hello

It's just the degenerate 0-arg case in a vararg, calling f once with no arguments. With an iterable bc::Broadcasted, I'd spell this foreach(identity, bc).

@tkoolen
Copy link
Contributor

tkoolen commented May 4, 2018

I think having Broadcasted objects be iterable makes a lot of sense regardless of this particular use case. However, having identity as the first argument of foreach is a little strange in my opinion. In the case of f.(x, y, z) I'd expect the f to be the first argument. I guess you just have to use something like the identity version if you need full support for recursive broadcasting, but maybe there's a way to make the non-recursive case more visually appealing? In this case, you could also have

foreach(args -> f(args...), @lazy tuple.(x, y, z))

which, although not that pretty, at least puts the f in the 'right' position. If there were a way to just have f as the first argument, that would be preferable in my view, as it would work nicely with do syntax.

@tkoolen
Copy link
Contributor

tkoolen commented May 4, 2018

I guess maybe with the new tuple destructuring, do syntax wouldn't be so bad:

foreach(@lazy tuple.(x, y, z)) do (a, b, c)
    println(a + b * c)
end

@stevengj
Copy link
Member

stevengj commented May 5, 2018

There is nothing preventing us from implementing a single argument version of foreach for Broadcasted objects. Or a different function. That issue is far down on my priority list because it requires no language changes or breakage.

The first question is how to get a non-materializing broadcast; I’m partial to the for syntax because it is terse and reminiscent of generator syntax. But it’s far from the only choice.

The other prerequisite is to make Broadcasted objects iterable.

@vtjnash
Copy link
Member

vtjnash commented May 5, 2018

The other prerequisite is to make Broadcasted objects iterable.

Seems vaguely reminiscent of #18618? And since we're opting towards defining broadcast in terms of iteration (#26435, akin to map), this seems reasonable. Is there anything I'm missing (e.g. do we need triage this issue)?

@mbauman
Copy link
Member

mbauman commented May 7, 2018

There are three different aspects of Broadcasted's "iterability." Internal implementation, exposed API, and supported arguments. My current plan is to:

  • Base the Broadcasted internal implementation on indexing — status quo
  • Support an iterable API (iteration over Broadcasted) simply by proxying to getindex — that's Make Broadcasted iterable and more indexable #26987.
  • Support iterable arguments simply by collecting them before use — after 0.7 deprecations are removed.

While it feels like we should be able to refactor Broadcasted to work purely based upon iteration alone (and not indexing), I think it'd be really hard and possibly impossible for some combinations of iterable arguments. Doing so performantly is even worse. We'd have to very carefully structure the order of iteration and nested loops so we never need to backtrack up dimensions. Also it would change the semantics of broadcast. For example, with a pure iteration implementation, in f.(1:3) .+ g.([1 2]) .+ h.() I'd expect f to execute three times, g to execute twice, and h to execute once. As it stands, all three functions execute 6 times. Folks rely on this behavior for things like A .*= rand.(). So I'm content to leave this alone.

@stevengj
Copy link
Member

stevengj commented May 11, 2018

Now that broadcasted objects are iterable, we should bikeshed a syntax for a non-materializing dot call. Just to start things off, maybe a non-binding straw poll of some non-breaking options (which eliminates f..(x)):

  • 👍 for for f.(x) (much like the current x for y in z generator syntax)
  • 🎉 for .(f.(x))
  • ❤️ for @lazy f.(x) (collides with Lazy.jl) or @lazy f(x) (acting like @.)
  • 😕 for none of the above

@cossio
Copy link
Contributor Author

cossio commented May 11, 2018

wouldn't for f.(x) collide with the syntax for an actual for loop?

@mcabbott
Copy link
Contributor

mcabbott commented Feb 6, 2019

Maybe the right way to do this is for air.(...) to materialize into something <: AbstractArray. And the simplest way to do that would be to re-use the BroadcastArray, which I tried in this PR: JuliaArrays/LazyArrays.jl#21

@tkf
Copy link
Member

tkf commented Feb 7, 2019

Can we use a macro @: for non-materializing broadcast? It seems this is syntactically invalid at the moment. But I suppose making it valid is not breaking? It's nice since it's short and does not introduce a new concept in the syntax (I think?).

@tkf
Copy link
Member

tkf commented Feb 16, 2019

I created a patch #31088 to construct a broadcasted object with @: to see if adding it does not break the current syntax in Julia. Combined with #31020, we can do sum(@: x .* y) to compute the dot product which is almost as fast as LinearAlgebra.dot but is more flexible.

@nalimilan
Copy link
Member

Thanks for pushing this forward!

Regarding the syntax, I'm not sure I like @:. It looks a bit like @., yet contrary to it it doesn't turn function calls into dot-calls: it only affects existing dot-calls. And : isn't related to broadcasting in Julia, but to ranges. Above it was suggested to use @lazy, which is more explicit. I've also proposed adding this meaning to @views, given that a Broadcasted object is a kind of view on its inputs (i.e. it avoids copies and modifying the inputs will affect the result).

@tkf
Copy link
Member

tkf commented Feb 18, 2019

Thanks for the comments!

And : isn't related to broadcasting in Julia, but to ranges.

I'm seeing @: to be related to quotes rather ranges. That is to say, :(expression) "guards" against evaluation of the expression and @:(expression) "guards" against materialization. @: has two dots in it so it's not hard to relate to the dot calls.

Above it was suggested to use @lazy, which is more explicit.

I think explicit is great but I don't think long means good. For example, lowered form of broadcasting expression is quite long but it's harder for humans to understand. I think @: is concise enough to not distract the important part of the code but yet salient enough to visually signify the "unusual" part.

Also, I think @lazy is too general term because there are too many lazy things in programming concepts. It's already noted that the name crashes with the one from Lazy.jl. It's better to have more specific notation.

I've also proposed adding this meaning to @views, given that a Broadcasted object is a kind of view on its inputs (i.e. it avoids copies and modifying the inputs will affect the result).

I think this is a great idea because it can be applied to multiple broadcasting expressions in a block. At the same time, it's also nice to have a single-purpose macro like @view.

@mbauman
Copy link
Member

mbauman commented Feb 20, 2019

This is great, thanks for tackling it @tkf! On the name, I find @: to be quite clever. Perhaps it is too clever by half, but I do like it. Adding this behavior to @views would unfortunately be breaking — lots of folks use @views in conjunction with dot fusion to work around the lack of a broadcasted getindex, and this would drastically change the return type to something that's no longer even an AbstractArray.

@cossio
Copy link
Contributor Author

cossio commented Feb 20, 2019

Some ideas:@,, @gen, @...

@chethega
Copy link
Contributor

FWIW, the three macros in that family @:, @., @views play well together: sum(@: @. @views a[I] * b[I]) works as expected in any of the six permutations.

@tkf
Copy link
Member

tkf commented Feb 21, 2019

Adding this behavior to @views would unfortunately be breaking

Good point. It has to be a new macro if we were to do this. But people can experiment with the interface in libraries if we have good supports for lazy broadcast.

@tkf
Copy link
Member

tkf commented Feb 21, 2019

Another motivation for this notation is that it is useful for defining "broadcastable call overload": https://discourse.julialang.org/t/best-practice-for-broadcastable-callables/21000

@tkf tkf linked a pull request Mar 30, 2019 that will close this issue
@tkf
Copy link
Member

tkf commented Mar 30, 2019

The new syntax for f.(args...) proposed by triage #31088 (comment) is implemented in #31553

@cossio
Copy link
Contributor Author

cossio commented Apr 29, 2019

Another solution is to implement an ArrayReducer struct which implements the reducing operation, but acts as if it had an appropirate size. See https://discourse.julialang.org/t/is-there-something-like-broadcast-mapreduce/6076/15 (I think this is very clever, is that you @meggart?). Then you can just call the ordinary broadcast! on it. I think it's an option worth considering.

@tkf
Copy link
Member

tkf commented Apr 30, 2019

We "just" have to decide the syntax (which is probably always the hardest part...). Unfortunately, it seems that there is no consensus in Triage #31553 (comment). I tried to push the discussion by implementing proof-of-concepts but I'm "out of bullets" ATM.

Regarding the implementation of broadcasted reduction, there is already #31020. I believe specializing mapreduce (and mapfoldl) is the right direction to go (rather than "hacking" broadcast! as done in ArrayReducer, IIUC).

@meggart
Copy link
Contributor

meggart commented Apr 30, 2019

I would second that. The ArrayReducer hack was done before the big 1.0 broadcast changes, which solved my immediate problem so I agree that it would not add anything significant here, and in particular #31020 is much more powerful.

Good luck anyway in finding a syntax for the lazy map, I would be eager to have it in Julia Base as well.

@StefanKarpinski
Copy link
Member

I think we do want something here, but this is a pretty deep feature and I think it doesn't hurt to let the idea simmer in the collective consciousness for a while. I'm just writing this to make it clear that people aren't against this, nor are we stonewalling, this is just the kind of feature that requires some significant rumination so that you don't end up baking in something that is good but not the best.

@jtrakk
Copy link

jtrakk commented Jun 27, 2021

Gilad Bracha has developed a new language that generalizes array broadcasting into n-dimensional streams.

ShapeRank is new programming language under development at F5. ShapeRank is targeted at data analytics, machine learning and reactive programming. ShapeRank is purely functional and statically typed. All ShapeRank values are multi-dimensional streams, and all operations are automatically lifted to process such streams in parallel. This lifting originates in the APL language family, and is known as rank-polymorphism. ShapeRank is unusual in that it extends rank-polymorphism to streams.

Interactive in-browser introduction | Talk video | GitHub

I really like the idea of non-materializing broadcast fusion over infinite streams like g.(f.(channel)). I see a lot of potential for live interactive programming on incoming data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection
Projects
None yet
Development

Successfully merging a pull request may close this issue.