Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Channels #5757

Closed
wants to merge 2 commits into from
Closed

RFC: Channels #5757

wants to merge 2 commits into from

Conversation

amitmurthy
Copy link
Contributor

Considering the discussion in #5511 , this PR implements Channels on top of RemoteRefs.

The channel is a queue of type Any with a size limit. The only operations supported are put, take, fetch, isready and wait. The calls return immediately if the length is within the limit specified at construction time, else they block. If unspecified, the default Channel size is 1000.

Introducing Channels on top of RemoteRefs (as opposed to treating RemoteRefs itself as a Channel of size 1), allows us to possibly implement other non-Q messaging idioms in the future.

The remotecll* functions only work off RemoteRefs.

Channels allow producer and consumer tasks to be independent of each other, and are quite useful as a work distribution queue.

@pao
Copy link
Member

pao commented Feb 10, 2014

@amitmurthy Looks like the end of the pull request message got cut off ("Channels will allow...")

@amitmurthy
Copy link
Contributor Author

Updated. Thanks.

@JeffBezanson
Copy link
Member

The isa checks for ChannelQ seem pretty ugly.

Are these intended to be used between processes, or only between tasks? If the first, I think we should just expand and rename RemoteRef rather than have multiple layers. If the second, the implementation will need to be highly optimized.

@amitmurthy
Copy link
Contributor Author

They are intended to be used between processes.

Let me rework it so that RemoteRefs are defined as typealias RemoteRef Channel{1} , i.e., in Channel{N}, N specifies the length of the Channel queue. I like the name RemoteRef, it is pretty clear that it refers to a single object, and I would like to retain it if possible.

@JeffBezanson
Copy link
Member

I don't think we need different types for different channel sizes.

@StefanKarpinski
Copy link
Member

Does it really make sense to have the queue length in the type? Are we going to dispatch code differently or benefit from different generated code? I'm guessing the only case that might really be different would be Channel{1} (and maybe Channel{0} that would block on put – this is what zero-length channels do in Go). That said, I do very much like where this is headed.

@amitmurthy
Copy link
Contributor Author

OK.

Go also allows specifying the type of a channel, in which case our channels could be defined as Channel{T}. Do we want to support this? Users may dispatch differently based on the type of data in the channel.

@StefanKarpinski
Copy link
Member

That makes much more sense to me than having the channel size in the type, although I think the real benefit is probably not dispatching on the element type of the channel but having channels that can only transmit certain kinds of values. (Tangentially related to #5762).

@amitmurthy
Copy link
Contributor Author

RemoteRef has been renamed to Channel. Constructed with Channel(pid=myid(); eltype=Any, sz=1) Default type and size are the same as an existing RemoteRef, i.e., Any and 1 respectively.

Neither Channel nor the internal ChannelBuffer that holds the queue are parametric types. puting a type different from the one specified at channel creation time will still fail since the underlying Vector is created with the specific type.

Have to add more tests, update doc and possibly implement channel with sz 0, i.e., an unbuffered channel, where a put will block till a take is requested.

@cgestes
Copy link

cgestes commented Feb 11, 2014

On Tue, Feb 11, 2014 at 2:26 PM, Amit Murthy notifications@github.comwrote:

RemoteRef has been renamed to Channel. Constructed with Channel(pid=myid();
eltype=Any, sz=a) Default type and size are the same as an existing
RemoteRef, i.e., Any and 1 respectively.

I'am still perplex about the pid argument.

Why would a Channel specify a pid? Should'nt the design be generic enough
to allow sending datas between any kind of tasks? (being coro or process)

Neither Channel nor the internal ChannelBuffer that holds the queue are
parametric types. puting a type different from the one specified at
channel creation time will still fail since the underlying Vector is
created with the specific type.

Have to add more tests, update doc and possibly implement channel with sz
0, i.e., an unbuffered channel, where a put will block till a take is
requested.


Reply to this email directly or view it on GitHubhttps://github.com//pull/5757#issuecomment-34753641
.

@amitmurthy
Copy link
Contributor Author

It is similar to RemoteRef(pid) wherein the RemoteRef is created on the pid process. It allows you to specify which process actually owns the underlying ChannelBuffer. Channels are just the existing RemoteRef renamed. The minor difference being that the buffer size can be greater than 1.

Typically you would not specify the pid, you would use it as Channel() - for the existing RemoteRef behaviour, or Channel(eltype=Int, sz=1000) to specify the type and length. You can send data across any tasks (even across processes) like you can do with RemoteRefs.

@StefanKarpinski
Copy link
Member

While I think it's fine as a transitional API, I think @ctaf42's point stands – it seems like a fishy API. Why does it make sense for a RemoteRef or a Channel to keep track of which pid owns it? What does it mean to construct a channel owned by a different process than the current one? Under what circumstances would you do such a thing? What happens when a channel gets sent to another machine? Genuine questions here.

@amitmurthy
Copy link
Contributor Author

While a Channel gets serialized/deserialized across processes, the underlying ChannelBuffer that it refers to, is never sent anywhere from the machine where it is first created. The Channel object just holds a remote reference to the ChannelBuffer.

For example, in a DArray, all the remote references to the local parts point to the respective processes (which created and initialized the local parts).

@amitmurthy
Copy link
Contributor Author

This is one reason I would like to keep the RemoteRef name - it is very clear what it means, and use a Channel to represent a remote queue. Maybe we should just typealias RemoteRef Channel instead of deprecating it.

@StefanKarpinski
Copy link
Member

Ok, that makes sense. We can do that for a while, but it makes sense to only have one name.

@amitmurthy
Copy link
Contributor Author

Closing as same functionality has been implemented in package https://github.com/amitmurthy/MUtils.jl

@amitmurthy amitmurthy closed this Mar 11, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants