Proposal For An Actor System Based On Mojo #1445
Signed-off-by: Reid Spencer <reid-spencer@users.noreply.github.com>
This is cool, Reid, thank you for putting this together. We're quite a bit too early to invest in this area IMO (we need to get traits much further along and complete lifetimes), but I think this is a very likely long-term direction. If you're interested, actors got built into Swift with a more complex model than was in the manifesto; you can read about it here, or in several swift-evolution proposals. I do hope we can eschew the complexity; a lot of it is due to legacy interop with Apple frameworks. OTOH, we may need such things to work with legacy Python and other libs.
@lattner - I understand the language earliness, but I think there's value in starting the actors project early. So, I have started already: https://github.com/ossuminc/moxy (extremely nascent). In the proposal, I've tried to minimize the requirements on Mojo. The recent introduction of traits allowed me to get started. All else that is needed is default implementations in the traits. Later on, when Mojo has matured, it would be interesting to integrate an ASIC or GPU to help with extremely fast message dispatch. All in good time. I'm happy to start this work without the involvement of Modular's time/resources, at least for now. Thanks for the reference to the actor implementation in Swift. I am examining several actor systems to try and glean the winning strategies from their patterns. Akka is the system I know best, but I'm open to merging the best ideas from other ecosystems. I plan to leave interoperability to the end of actor development and not sacrifice simplicity or performance. In other words, interoperability will have its own complexity and costs, as an add-on.
Senders and Receivers
I have a suggestion for the mojo-features-needed.md concurrency section: senders and receivers, a.k.a. Basically, it's a universal abstraction to express all concurrency and parallelism without the need for locks. One of p2300's authors, Eric Niebler, calls them "lazy futures". For a more technical explanation of the work that it builds on (i.e. delimited continuations and monads), you can read p2300 itself and/or read this article. This presentation is also helpful. Of particular note are the theoretical results in the paper p2504 - Computations as a global solution to concurrency (the paper refers to senders as 'computations'). The findings are summarized in this article: Lucian Radu Teodorescu summarizing p2504
This article states many advantages of senders and receivers: Lucian Radu Teodorescu on the advantages of senders and receivers
Senders and receivers enable structured concurrency. Using them, you can make higher level abstractions for lockless concurrency and parallelism:
So we can use senders and receivers to implement the actor system, while allowing users to build any concurrency abstraction they need without the need for locks. I think this aligns very well with Mojo's goals of providing sensible defaults and high level abstractions while exposing as much as possible as libraries so users can easily build their own abstractions. Lucian Radu Teodorescu praising S&R
Niall Douglas praising S&R
Eric Niebler defending S&R
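To make the "lazy futures" description above concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the P2300 C++ API: the names `just`, `then`, and `sync_wait` are borrowed loosely from P2300, while the `Sender` class and `traced_double` are hypothetical. The sketch models only the value channel (no error or stopped channels, no schedulers). A sender merely describes work; nothing runs until it is started with a receiver.

```python
class Sender:
    """A description of work. Nothing executes until start() is called."""

    def __init__(self, run):
        # run: a function taking a receiver (a callback accepting one value)
        self._run = run

    def start(self, receiver):
        self._run(receiver)


def just(value):
    """A sender that, once started, delivers `value` to its receiver."""
    return Sender(lambda receiver: receiver(value))


def then(sender, fn):
    """A sender that applies `fn` to the upstream value before delivering it."""
    return Sender(lambda receiver: sender.start(lambda v: receiver(fn(v))))


def sync_wait(sender):
    """Start the sender and return the value it delivers (synchronous sketch)."""
    result = []
    sender.start(result.append)
    return result[0]


log = []


def traced_double(x):
    # Record the call so we can observe when the work actually runs.
    log.append(f"double({x})")
    return x * 2


# Building the pipeline runs nothing: `log` is still empty at this point.
pipeline = then(then(just(20), traced_double), lambda x: x + 2)
assert log == []

# Only starting the pipeline executes the work.
result = sync_wait(pipeline)
```

The point of the sketch is the separation between describing a computation and running it. A real implementation would also decide where the work runs (a scheduler) and how errors and cancellation propagate.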
It's not all sunshine and rainbows
As explained in this video, senders and receivers have a lot of advantages, but the C++ implementation has some flaws too:
Safety
Performance
Niall Douglas on the disadvantages of universal generic async APIs
...and more recently here: Niall Douglas on how p2300 senders and receivers aren't good enough for extremely low latency applications
Tsung-Wei Huang's opinion of senders and receivers
This is referring to features like conditional tasking. After a bit of digging, this may not be true: Eric Niebler on how to conditionally change the execution pipeline at runtime
Discussions on how type erasure affects performance of senders and receivers
Ease of Use
If Mojo can implement senders and receivers in a better way by avoiding these pitfalls, I think it will have the best concurrency model out of all programming languages.
Additional Reading
The 'Resources' section of Nvidia's reference implementation.
Futures
Also for the futures and promises section, you can look at the STLab concurrency library for inspiration: STLab futures description
Channels are this library's equivalent of the Communicating Sequential Processes (CSP) model. P.S.: It seems the STLab concurrency library may get actor and sender/receiver implementations of its own. P.P.S.: Sean Parent on sender/receiver: Sean Parent on senders/receivers
And here from 10:40 to about 14:30 -
Reactors
I recently learned of an alternative to actors that gets rid of non-determinism. They're called reactors. More papers and stuff here.
Going to close this PR for now since there hasn't been much activity on this in several months. Feel free to reopen when we're ready to take on this kind of work.
@Brian-M-J I've looked into the senders and receivers proposal. I see people promising big things ("a global solution to concurrency"), but notably, they don't seem to be able to back up their claims with evidence. I would expect to see an example of a non-trivial concurrent program that is dramatically easier to write with senders as opposed to being written with async/await and tasks etc. All I see are toy examples. I am really skeptical that something big has been discovered. If it had, more people would have noticed by now. Senders and receivers have been around for 4+ years.
I'd be able to show it to you if it was open source 🙂. In terms of applications, I'd say the most prolific users of S&R are at Meta. In terms of libraries, there's plenty of open source stuff out there (see users of HPX). I guess talks like this would be good demonstrations at least. I guess another thing to note is that S&R is meant for high performance, so the C++ implementation might not be the most beautiful library. Starting from a blank slate (Hylo, Mojo (somewhat)) means that some syntactical niceties can be added to make it easier to use. Edit: I just found this Reddit post that may be of help to future readers: Any real life examples of P2300 senders/receivers?
As you noted, S&R is relatively new, so it isn't widely used yet. Though I wouldn't say that only a few people have noticed it. Clearly Nvidia has noticed, because they host the reference implementation. Meta has noticed it, because they've been using it in production for years. Bloomberg has noticed, because a few of their employees are authors of the proposals. Adobe has noticed, because there are plans to remodel the stlab library in terms of S&R, and Hylo is supported by the company. The users of libraries like HPX, folly etc. have noticed. The C++ committee has noticed, because they've voted favorably on it multiple times.
That's another toy example. All he's done is create an event loop that spawns asynchronous tasks one at a time. This is trivial to do in any language with async/await and a
I would love it if S&R had solved some major problems with modelling concurrent systems, but I don't see it.
At the risk of stating the obvious, there are a lot of projects out there developed by well-meaning, passionate people, who promise that they have created something important. But most of the time, that doesn't turn out to be the case. I've been burned a lot in the past by believing that a project is as important as the contributors say it is, and then I've begun to experiment with it, only to eventually discover that I've wasted my time.
Well, the only thing I can show you when I don't have access to the code is this: Hylo has gotten rid of function colouring (see here). BTW when they say "no
I myself am working on a Mojo implementation of S&R based on this talk in a private repo. That way when Mojo's coroutines drop with some benchmarks, I'll have something concrete to compare to. The biggest problem is that I don't know what I'm doing 😅. I'm not a low level / library expert or anything.
Getting rid of function coloring is a worthy goal, no doubt about that. But that is orthogonal to S&R. The reason most PLs have colored functions is because the thread-based concurrency model that they had already implemented prior to implementing async/await is incompatible with implicit suspension and implicit migration of tasks between threads. In the case of Python, the main reason you can't implicitly switch tasks is because a Python program is a big soup of shared mutable state with no synchronization/critical sections, so two tasks can easily race each other. The solution to this is to come up with a concurrency model that ensures tasks can't race each other. You can get most of the way there with a Rust-like borrowing system. On top of that, you'd want a way to perform transactions on shared state. This is a big design space worth exploring. S&R doesn't really have a solution here. (I'd like to see an example of multiple tasks concurrently printing to |
I've seen that. The idea is that if you can statically identify all of the places in your codebase where a variable is being accessed by multiple tasks (and if at least one of those tasks mutates the variable), then you (or your compiler) can conceivably restructure the program (cut it up into subtasks) such that any time a task needs to access the shared variable, you defer the access to a scheduler/executor. (And the executor contains the synchronization primitives required to avoid data races.) This is a good observation, and I strongly agree that a task-based concurrency model should aim to do this. (Concurrency without explicit locking would be amazing!) But this is, again, completely orthogonal to S&R. S&R doesn't give me a simple way to write that restructured program, from what I can tell.
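As an illustration of the restructuring described above (deferring every access to a shared variable to an executor rather than locking at each call site), here is a small Python sketch. It is a hand-written approximation, not something S&R or any compiler produces: `SerialCell` and `run_demo` are hypothetical names, and a single-worker `ThreadPoolExecutor` stands in for the scheduler that owns the synchronization.

```python
from concurrent.futures import ThreadPoolExecutor


class SerialCell:
    """Owns a value; every access is funnelled through one worker thread."""

    def __init__(self, initial):
        self._value = initial
        # One worker means submitted accesses run one at a time, in FIFO order.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def update(self, fn):
        """Schedule value = fn(value); returns a Future for the new value."""
        return self._executor.submit(self._apply, fn)

    def _apply(self, fn):
        self._value = fn(self._value)
        return self._value

    def read(self):
        """Read the value after all previously submitted updates have run."""
        return self._executor.submit(lambda: self._value).result()

    def close(self):
        self._executor.shutdown(wait=True)


def run_demo(tasks=8, increments=1000):
    """Several tasks increment one shared counter with no locks in user code."""
    cell = SerialCell(0)

    def worker():
        for _ in range(increments):
            cell.update(lambda v: v + 1)

    # Leaving the `with` block waits until every worker has finished submitting.
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        for _ in range(tasks):
            pool.submit(worker)

    total = cell.read()
    cell.close()
    return total
```

Calling `run_demo()` should return `tasks * increments`, because each increment runs on the cell's single worker. The cost is that every access becomes a task submission, which is the kind of overhead the thread debates elsewhere.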
Actually, I'd only read the second article you linked, but the first article is more interesting IMO, because it actually discusses the "program restructuring" problem I'm referring to:
This proposed solution, forking a new task to mutate the shared variable, makes a lot of sense. It's worth exploring further.
@nmsmith maybe take a look at how NVIDIA leverages P2300 with CUDA to do async computation on GPUs? https://www.youtube.com/watch?v=nwrgLH5yAlM
This is currently a work in progress. There are no code changes, just a proposal written in the proposals section. This was pre-approved by Chris Lattner in a conversation in June 2023.
I will keep working on this as I have time, but it is far enough along that I'm looking for feedback and assistance from interested parties.
I will take it out of draft mode when it's a little further along.
Signed-off-by: Reid Spencer reid@ossuminc.com