Next big C# language should focus on supporting building distributed system and concurrency programming #502

Closed
asydneylover opened this issue Apr 25, 2017 · 87 comments

Comments

@asydneylover

The next version of the C# language should focus on making it easier to build distributed systems and write concurrent programs instead of introducing more syntactic sugar. That's the only way to make C# a really big programming language in comparison with others such as Java.
One feature I have in mind is native support for Golang-style coroutines.

@HaloFour
Contributor

C# already has asynchronous coroutines in the form of async/await. How would implementing goroutines improve on that?

Actual distributed programming can already be solved through libraries such as Akka.

@smoothdeveloper
Contributor

@uyhung what feature of Java do you feel makes it more suitable than C# in this use case? I'd think it is more about libraries or middleware than language level.

@sharwell
Member

@uyhung I'm curious if you mentioned Java because of experience working on distributed/concurrent programming in that environment, or if you just listed it as an example. As someone who spent a great deal of time on asynchronous and concurrent (but not so much distributed) programming in Java and C# over the past several years, I came to essentially the opposite conclusion.

@HaloFour
Contributor

HaloFour commented Apr 25, 2017

Indeed, Java's advantage here is, at best, ecosystem. The language and JDK, if anything, actively fight you at every step.

Having to maintain both a C# WebAPI site and a JAX-RS site, I can say the experience isn't even remotely comparable. It doesn't help that J2EE is still stuck in Java 7 land, which is all blocking on futures and callback hell. The experience is somewhat less awful using alternate JVM languages, but not by much. To my knowledge Spring has some support for coroutines via AOP and bytecode rewriters, but only at the method level. There's also EA's Orbit library, which does something similar. I'm actively working to incorporate the latter into my projects because I find it appalling to have to manually write callback/lambda continuations anymore, but none of that even remotely comes close to just how well async/await works out of the box. I expect Java to copy it wrong by Java 10 or so, a few decades from now.

Funnily enough probably the best concurrency library on the JVM right now is RxJava. It irks me to no end that Microsoft's own framework is getting more love there than at home.

@benaadams
Member

benaadams commented Apr 25, 2017

@uyhung This should do the same thing as the "A Tour Of Go: Goroutines" example does:

using System;
using System.Threading.Tasks;

public class Program
{
    public static async Task Say(string s)
    {
        for (var i = 0; i < 5; i++)
        {
            await Task.Delay(100);

            Console.WriteLine(s);
        }
    }

    private static async Task MainAsync(string[] args)
    {
        var go = Task.Run(() => Say("world"));
        var t = Say("hello");

        await Task.WhenAll(go, t);
    }

    public static void Main(string[] args)
    {
        MainAsync(args).Wait();
    }
}

Edit: improved example here: #502 (comment)

@AlgorithmsAreCool

@uyhung Here are a couple of things that you might find interesting:

Corefxlab Channels (prototype)

Orleans

TPL Dataflow

@asydneylover
Author

@sharwell @smoothdeveloper ,
I had 7 years of experience working with C#, then the last two years with Java. And yes, in terms of language design C# is far better than Java; Java's generics in particular are implemented badly. You're correct, Java as a language does not have any special features that better support concurrency or distributed systems, but its ecosystem is amazing. In Java they have Akka, Karaf, Kafka, Loghom, Vert.x, Spark, Elasticsearch, Hadoop, Spring etc. Even though Rx was originally created in our .NET world, the world now seems to know only RxJava; people even believe that Rx.NET is ported from RxJava :)

Hopefully Microsoft can find a way to make .NET Core a really big ecosystem.

@HaloFour
Contributor

HaloFour commented Apr 28, 2017

@uyhung

This repo is specifically for changes to the C# language. If you want to make specific suggestions for inclusions in the BCL to support distributed systems you can try the CoreFX repo. But unless you're expecting Microsoft to build out that ecosystem on their own I'm not sure what you're expecting to happen. Microsoft is providing the environment and trying to expand its reach. It's up to others to populate the ecosystem. It's not like Sun/Oracle have anything to do with any of those projects that you've mentioned. If anything they flourished due to the massive gaps in functionality left by the JDK.

@benaadams
Member

You can specify your own custom TaskScheduler; it just uses the ThreadPool by default to maximise CPU usage.
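
As a rough illustration of the custom scheduler point, here is a minimal sketch of a TaskScheduler subclass that runs every task on one dedicated thread instead of the ThreadPool. The class name SingleThreadScheduler, the thread name, and the RunOnScheduler helper are hypothetical and exist only for this example; a production scheduler would need more care around shutdown and error handling.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler: executes every queued task on one dedicated thread.
public sealed class SingleThreadScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly Thread _worker;

    public SingleThreadScheduler()
    {
        _worker = new Thread(() =>
        {
            // Drain the queue until CompleteAdding is called.
            foreach (var task in _queue.GetConsumingEnumerable())
                TryExecuteTask(task);
        })
        { IsBackground = true, Name = "single-thread-scheduler" };
        _worker.Start();
    }

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Inline execution is only allowed when already on the worker thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => Thread.CurrentThread == _worker && TryExecuteTask(task);

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();

    public void Dispose() => _queue.CompleteAdding();
}

public static class Program
{
    // Returns the name of the thread that actually ran the task.
    public static string RunOnScheduler()
    {
        using (var scheduler = new SingleThreadScheduler())
        {
            var factory = new TaskFactory(scheduler);
            return factory.StartNew(() => Thread.CurrentThread.Name).Result;
        }
    }

    public static void Main() => Console.WriteLine(RunOnScheduler());
}
```

The TaskFactory is what routes work to the scheduler; anything started with Task.Run still goes to the default ThreadPool scheduler.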

@HaloFour
Contributor

@sgf

It sounds like you don't understand how async/await works. It has nothing at all to do with threading. It is purely user-mode and is completely agnostic to whatever switching mechanism is employed, if one at all. It is completely disconnected from Task, especially in C# 7.0 where even the requirement to use Task as a proxy return value has been lifted. There's no requirement to use a ThreadPool or any of the rest of the TPL.

async/await is very elegant and very simple. It serves as the prototype for coroutines for a number of other languages, including JavaScript, Kotlin and C++.
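
To make the "disconnected from Task" claim concrete, here is a small sketch of awaiting something that is not a Task at all. The compiler only needs a GetAwaiter method whose result exposes IsCompleted, GetResult and OnCompleted. The Immediate<T> type is hypothetical and invented for this example; it always completes synchronously, so no thread pool or scheduler is involved. The async method itself still returns Task<int> here, because writing a custom task-like return type would need a full method builder.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical awaitable that is not a Task and never touches a thread pool.
public struct Immediate<T>
{
    private readonly T _value;
    public Immediate(T value) { _value = value; }

    // The compiler looks for this method by pattern, not by interface.
    public Awaiter GetAwaiter() => new Awaiter(_value);

    public struct Awaiter : INotifyCompletion
    {
        private readonly T _value;
        public Awaiter(T value) { _value = value; }

        public bool IsCompleted => true;                  // never suspends the caller
        public T GetResult() => _value;
        public void OnCompleted(Action continuation) => continuation();
    }
}

public static class Program
{
    public static async Task<int> AddAsync()
    {
        int before = Thread.CurrentThread.ManagedThreadId;
        int a = await new Immediate<int>(40);
        int b = await new Immediate<int>(2);

        // Both awaits completed synchronously, so the thread cannot have changed.
        if (Thread.CurrentThread.ManagedThreadId != before)
            throw new InvalidOperationException("unexpected thread switch");
        return a + b;
    }

    public static void Main() =>
        Console.WriteLine(AddAsync().GetAwaiter().GetResult());
}
```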

@yaakov-h
Member

yaakov-h commented May 1, 2017

@sgf async/await does not always use the threadpool and does not always switch threads, or other contexts. You have a lot more control than you think.
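
A small sketch of that point, assuming a plain console application with no SynchronizationContext installed and a TaskCompletionSource created with default options. Under those assumptions the continuation after the await is expected to run inline on whichever thread completes the task, with no thread pool hop. The method name ResumesOnSignallingThread is made up for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    // Reports whether the code after the await ran on the thread that
    // completed the awaited task (here: the calling thread itself).
    public static bool ResumesOnSignallingThread()
    {
        var signal = new TaskCompletionSource<bool>();
        int resumedOn = -1;

        async Task Waiter()
        {
            await signal.Task;   // suspends: the task is not complete yet
            resumedOn = Thread.CurrentThread.ManagedThreadId;
        }

        Task waiter = Waiter();  // runs synchronously up to the await
        signal.SetResult(true);  // the continuation is invoked from here

        return waiter.IsCompleted
            && resumedOn == Thread.CurrentThread.ManagedThreadId;
    }

    public static void Main() => Console.WriteLine(ResumesOnSignallingThread());
}
```

With a UI SynchronizationContext, or with TaskCreationOptions.RunContinuationsAsynchronously on the TaskCompletionSource, the continuation would be posted elsewhere instead, which is the kind of control the comment refers to.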

@HaloFour
Contributor

HaloFour commented May 1, 2017

@sgf

Do you actually have a concrete proposal to improve the language or are you just here to complain that you don't like C#? If you don't care to listen to how a feature that you are disparaging incorrectly actually works then there is little reason for anyone to attempt to hold a dialog with you.

I have nothing against Go's Goroutines. I think that they work quite well in that language. However Go was designed very explicitly around its threading/synchronization model with its baked-in user-mode switching and kernel-mode backing of fibers. I don't believe that attempting to add such to an existing language/framework would be met with as much success since nothing that already exists would understand how the cooperative multithreading would work nor would they understand how the fiber mechanism would work. Any existing code that blocked the kernel thread would block every Goroutine scheduled to be handled by that thread. That is the issue with fiber models in general unless you build everything around that from day one.

@sharwell
Member

sharwell commented May 9, 2017

@sgf I can say with very high confidence that there are many people working on or around C# for whom performance is a very important priority across a wide variety of scenarios. For example, there are people like @stephentoub (and to a lesser degree myself) who are specifically interested in improving behavior of multithreaded applications through smarter algorithms (e.g. lock-free approaches), and people like @benaadams who never let us hear the end of it when anything is slow (he's a fun guy to have dinner with and talk about what can be better!). We also have people like @gafter and @jaredpar working on the language itself to both enable and encourage practices with the C# language that lead to better performing and more reliable applications for end users.

This is not to say we're the best. Of course I may have an opinion on that, but it's not the point I'm trying to make. What I am saying, is if you believe you've found specific areas where improvements to the language, the runtime, or the tooling will enable end users to have a better-performing product, there will be multiple people eager to consider the ideas regardless of which subsystem they lie in. Just remember that the ecosystem is very large so even when people are working with you to make a change it can take a bit of time to actually ship out. 👍

@HaloFour
Contributor

HaloFour commented May 9, 2017

@sgf

Again, are you planning on actually proposing something here? Your comments do nothing to continue the discussion. In fact I'd go so far as to say that they simply demonstrate that you don't understand what you're talking about.

native,coroutine is the current mainstream technology.

C# has native coroutines. That is what async/await and yield are. It is becoming mainstream in that other languages are copying it.

No, they're not the same as goroutines. Only Go has goroutines. The only reason they work in Go is because Go was designed very specifically around them. You cannot take fiber-based cooperative multitasking (a very old concept, by the way) and hack that into existing languages while protecting it from native threading concerns and blocking (which has devastating consequences). You just can't.

Java is also progressing more good.

How? What does the language offer you here that C# doesn't? Nothing. Absolutely nothing. From a language and framework point of view Java is well over a decade behind C# in terms of distributed and concurrent programming.

@mattwar
Contributor

mattwar commented May 9, 2017

If anyone has any ideas on how to make the language friendlier to distributed and concurrent programming feel free to speak up.

@HaloFour
Contributor

HaloFour commented May 9, 2017

@mattwar

#88
Support for IObservable<T> in #43
Possibly some kind of async await support in match or with pattern matching in general

@mattwar
Contributor

mattwar commented May 9, 2017

#88 is somewhat interesting because it removes some of the boilerplate of writing and coordinating event handling logic. Like the inverse of async/await. Not sure what it means to keep all those parameter values around between separate API invocations, what happens to the object's state when API calls are not made in the expected order, or how the object itself behaves with regard to overlapping concurrent calls, etc.

Maybe the object is generated with locks that only let new invocations through when the previously initiated sequences are complete. Or maybe if the APIs are async, they wouldn't need to block.

Interesting to think about.

@HaloFour
Contributor

@mattwar

The point is to avoid all blocking. The mailbox methods always return instantly, and the chorded methods are always async. The parameters captured from the mailbox methods are available in the chorded methods in the order in which they were invoked. They're pretty similar to BufferBlock<T> using the Post and ReceiveAsync extension methods.

Where it gets interesting is when multiple mailbox methods are used with multiple chorded methods. As I describe you can do this with Dataflow today using JoinBlock<T1, T2>s but of course it requires a lot more boilerplate code.
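
For readers who have not used TPL Dataflow, the analogy above can be sketched roughly as follows. This assumes the System.Threading.Tasks.Dataflow NuGet package is referenced; the block names and the JoinOnce helper are invented for the example. The two Post calls play the role of mailbox methods that return immediately, and the awaited ReceiveAsync on the JoinBlock plays the role of the chorded method that fires once both inputs are available.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

public static class Program
{
    public static async Task<string> JoinOnce()
    {
        // Two "mailboxes" that buffer posted values.
        var names = new BufferBlock<string>();
        var ages = new BufferBlock<int>();

        // The join pairs one value from each mailbox, in arrival order.
        var join = new JoinBlock<string, int>();
        names.LinkTo(join.Target1);
        ages.LinkTo(join.Target2);

        // Post returns immediately; nothing blocks here.
        names.Post("Ada");
        ages.Post(36);

        // The "chord": completes once both halves have arrived.
        Tuple<string, int> pair = await join.ReceiveAsync();
        return $"{pair.Item1} is {pair.Item2}";
    }

    public static async Task Main() => Console.WriteLine(await JoinOnce());
}
```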

@mattwar
Contributor

mattwar commented May 10, 2017

@HaloFour I'm still trying to get my head around it. I understood it a lot better years ago when the C-omega work was ongoing, but now I am having a hard time understanding if this is a generally interesting capability or a narrowly focused one. For instance, in order for all the parameters from all the mailbox functions to be available to the async (chorded) method, each mailbox function can only be called once. That makes it straightforward to understand how everything combines, but it may only map to a narrow subset of interesting dialog/protocol patterns. Even if it doesn't solve everything, is it an adequate building block? Not sure yet. I'm trying to model it and it seems like every chord grouping has a lot of expensive data structures going on, so it is a bit deceptive about the cost of making the synchronization work.

@John0King

John0King commented Jul 1, 2017

Goroutines don’t suffer from “async all the way” problem. Async-await implies that if you make a chain of function calls (A calls B, B calls C, … Y calls Z), and both A and Z are async functions, B … Y also have to be async functions, otherwise the model won’t work (non-async Y can’t await for Z, non-async X can’t await for Y, etc. — which means they either have to “start and forget” corresponding async functions, or wait for them synchronously, or become async as well). On contrary, there is no such constraint in Go: you can read from a channel in any function, and no matter what, it’s always an asynchronous operation. That’s actually a big advantage, since you don’t have to plan on what’s going to be asynchronous ahead of time. In particular, you can write a query(…) method invoking some query provider to get the result, and this provider can do this either synchronously or asynchronously dependently on the implementation — but you, as an author of query(…) method, don’t have to think about this while you write it.

I read that in this article.

I think the key point here is to let any synchronous method be used as an asynchronous method without reimplementing it with async and Task.

@benaadams
Member

benaadams commented Jul 1, 2017

@John0King Go does it by not giving you the option of using dedicated threads, having transferable thread stacks and having all functions interruptible (by the runtime); so no function is really sync, even CPU-bound ones, and you can't control your scheduling (though you can call LockOSThread to move to blocking mode).

Simplifying, the async keyword means "I don't need to own this thread, and allow this function's stack to be saved and resumed", and the awaits are the explicit save/restore points. This causes the "async all the way" effect, as functions that are not async "own" the thread while they are running; so it's about where you mark the jumping-off point for thread flexibility. Though "async all the way" is a misnomer and "Task(-like) returning" is probably better; much like "sync" is often incorrectly used for "blocking".

.NET gives you more power and control than Go, but can be more complicated as a result. The most common pitfall is using blocking methods on a threadpool thread when you don't care about thread ownership; since you need to be explicit, it's often missed.

You can even serialize the execution of async methods, transfer it to a different machine and then pick up the execution later...
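
The "async all the way" trade-off discussed in the last two comments can be sketched as below. The method names Z, YAsync, YBlocking and YFireAndForget are hypothetical and simply mirror the A...Z call chain from the quoted article. A synchronous caller of an async method has three choices: become async too, block its thread until the result is ready, or start the work and never observe the result.

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // Z is asynchronous: it suspends at the await instead of holding a thread.
    public static async Task<int> Z()
    {
        await Task.Delay(10);
        return 42;
    }

    // Option 1: Y becomes async as well ("async all the way").
    public static async Task<int> YAsync() => await Z() + 1;

    // Option 2: Y stays synchronous and "owns" its thread while it waits.
    public static int YBlocking() => Z().GetAwaiter().GetResult() + 1;

    // Option 3: Y starts Z and forgets it; the result is never observed.
    public static void YFireAndForget() { _ = Z(); }

    public static async Task Main()
    {
        Console.WriteLine(await YAsync());
        Console.WriteLine(YBlocking());
        YFireAndForget();
    }
}
```

Option 2 is the pitfall mentioned above when it runs on a threadpool thread, and it can deadlock outright when a single-threaded SynchronizationContext is in play.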

@CyrusNajmabadi
Member

@sgf: Do you have a concrete proposal on what you would like to see improved?

@benaadams
Member

benaadams commented Jul 1, 2017

Updated the comparison for C# 7.1 with the example from "A Tour Of Go: Goroutines", and matched the formatting for a better comparison.

Go version

package main

import (
    "fmt"
    "time"
)

func say(s string) {
    for i := 0; i < 5; i++ {
        time.Sleep(100 * time.Millisecond)
        fmt.Println(s)
    }
}

func main() {
    go say("world")
    say("hello")
}

C#7.1 version

using System.Threading.Tasks;
using static System.Console;
using static System.Threading.Tasks.Task;

public class Program
{
    static async Task Say(string s) {
        for (var i = 0; i < 5; i++) {
            await Delay(100);
            WriteLine(s);
        }
    }

    static Task Main() => WhenAll(
        Run(() => Say("world")),
        Say("hello")
    );
}

@skynode

skynode commented Oct 19, 2017

Kinda late to the party but interesting thread right here. Unfortunately @sgf is unable to clearly articulate his concerns or make constructive proposals for improvement of C#.

I personally have had cases where I had to think about the program flow while mixing async/await methods with non-async/await methods but I guess it's probably due to some knowledge deficit on my part. For instance, I just learned lately that using Task.Run(() => ) is not quite as performant as the standard procedure? Will work more towards fully understanding the inner mechanics of it.

Any suggested resources would be greatly appreciated @HaloFour @mattwar @benaadams. Thanks all!

@svick
Contributor

svick commented Oct 19, 2017

@skynode

For instance, I just learned lately that using Task.Run(() => ) is not quite as performant as the standard procedure? Will work more towards fully understanding the inner mechanics of it.

What "standard procedure"? If you're comparing something like Task.Run(() => M()) with just M(), then of course Task.Run() is going to use more CPU, because it has to do more. (Depending on how exactly you use it, it might make your code faster, due to parallelization.)

@HaloFour
Contributor

There's yet to be a proposal here. Goroutines can't be implemented in C#. They can't be implemented in any language that isn't Go.

Channels can be implemented in libraries and require no language changes. There already is Rx, TPL DataFlow, Akka.NET, and likely many others that fill that need, as well as the concurrent collections in the BCL.
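
As one illustration of channels living in a library, here is a sketch that assumes the System.Threading.Channels package is available. That package is an assumption of this example: the thread itself only mentions the corefxlab Channels prototype, Rx, TPL Dataflow and Akka.NET. The Produce, Consume and SumThroughChannel names are made up. A bounded channel of capacity 1 makes the writer wait whenever the reader falls behind.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class Program
{
    static async Task Produce(ChannelWriter<int> writer)
    {
        for (var i = 1; i <= 5; i++)
            await writer.WriteAsync(i);   // waits asynchronously while the channel is full
        writer.Complete();                // signals that no more items will arrive
    }

    static async Task<int> Consume(ChannelReader<int> reader)
    {
        var sum = 0;
        // WaitToReadAsync yields false once the writer has completed and the channel is empty.
        while (await reader.WaitToReadAsync())
            while (reader.TryRead(out var item))
                sum += item;
        return sum;
    }

    public static async Task<int> SumThroughChannel()
    {
        var channel = Channel.CreateBounded<int>(1);
        var producer = Produce(channel.Writer);
        var sum = await Consume(channel.Reader);
        await producer;
        return sum;
    }

    public static async Task Main() => Console.WriteLine(await SumThroughChannel());
}
```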

@masonwheeler

masonwheeler commented Oct 20, 2017

@HaloFour

There's yet to be a proposal here. Goroutines can't be implemented in C#. They can't be implemented in any language that isn't Go.

This is trivially untrue by the principle of Turing Equivalence.

A goroutine is really nothing more than a thread, in a language with a built-in syntax keyword for spawning threads. You can already create threads in C#, and Channels are essentially just a blocking queue. There's really nothing there that we don't already have access to. (Yes, it's technically a fiber, but show me one thing you can do with goroutines and channels that you can't trivially accomplish with threads and blocking queues.)
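
The "threads and blocking queues" claim can be sketched with types already in the BCL. This is only an approximation under stated assumptions: BlockingCollection<T> requires a capacity of at least 1, so it is closer to a buffered Go channel than to an unbuffered one, and a Thread is far heavier than a goroutine. The Drain helper is a name invented for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class Program
{
    public static List<int> Drain()
    {
        var received = new List<int>();
        using (var queue = new BlockingCollection<int>(boundedCapacity: 1))
        {
            // Roughly `go func() { ... }()`: a producer on its own thread.
            var producer = new Thread(() =>
            {
                for (var i = 1; i <= 3; i++)
                    queue.Add(i);          // blocks while the queue is full
                queue.CompleteAdding();    // roughly `close(ch)`
            });
            producer.Start();

            // Roughly `for v := range ch`: blocks until items arrive or adding completes.
            foreach (var item in queue.GetConsumingEnumerable())
                received.Add(item);

            producer.Join();
        }
        return received;
    }

    public static void Main() => Console.WriteLine(string.Join(",", Drain()));
}
```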

@AlgorithmsAreCool

@masonwheeler I think @HaloFour's point is that the concept of a goroutine exists only in Go. Similar or equivalent constructs may exist in other languages but they are not, strictly speaking, goroutines.

I believe the point is that asking for C# to embed a core Go concept is the wrong question to ask; instead, a proposal for a construct that is native to C# is what is needed to move forward.

@masonwheeler

@AlgorithmsAreCool And what new construct would that be?

Again, what can Go do that C# can't do just as easily with threads (or Task.Run, if you prefer) and blocking queues? What problem exists here that we need a solution to?

@omidkrad

omidkrad commented May 9, 2018

If async operator overloading was supported, I think we could simply overload some operator like << to emulate <- of Go. And for the co/go keyword, maybe we can overload some unary operator like ~ to wrap Task.Run(() => ...) for us. I know I'm just throwing ideas! :)

@svick
Contributor

svick commented May 9, 2018

@omidkrad

Wouldn't await (c << x); be good enough? That's something you could do today with normal operator overloading. EDIT: Turns out it's not possible to do this.

C# intentionally makes every await clearly visible, so I don't think async operator overloading is going to be added to the language.

@omidkrad

omidkrad commented May 9, 2018

Sure await (c << x) will do it 👍

Using operator overloading we should also be able to make a short syntax for running tasks in parallel (like the go keyword). For example, it could be something like these:

~ Say("hello");
go | Say("Hello");
go >> Say("Hello");

@masonwheeler

For example, we could have the <- operator

No, we really couldn't, because that already has a well-defined meaning that it's quite realistic to assume will be used in production code: "less than negative".
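
A tiny sketch of why `<-` is already taken: today the two characters parse as the `<` operator followed by unary minus. The LessThanNegative helper is a name invented for the example.

```csharp
using System;

public static class Program
{
    // `a<-b` already parses as `a < (-b)`: "less than" followed by unary minus.
    public static bool LessThanNegative(int a, int b) => a<-b;

    public static void Main()
    {
        Console.WriteLine(LessThanNegative(1, 5));    // False: 1 < -5 does not hold
        Console.WriteLine(LessThanNegative(-10, 5));  // True: -10 < -5 holds
    }
}
```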

@CyrusNajmabadi
Member

CyrusNajmabadi commented May 9, 2018

Most of my 'go's are around 'funcs'. So the equivalent in C# (which seems fine with me) is just:

using static System.Threading.Tasks.Task;
//...
Run(() => {
    // all the work
});

Which is basically the same as:

go func() {
    // all the work
}()

Note: these have identical character counts, and I'm not sure I see any real need to make this much better.

@omidkrad

omidkrad commented May 9, 2018

Yes, I have to agree it's not much improvement. This one is even more characters:

go >> () => {
    // all the work
};

This one is a little better:

~() => {
    // all the work
};

but I really like this:

await (c << x);

It would be great to have this come with the Channels API.

@HaloFour
Contributor

HaloFour commented May 9, 2018

@omidkrad

Per the C# spec the second operand of an overloaded bitshift operator must be an int.
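
A minimal sketch of the rule as stated in the comment above, using a hypothetical IntChannel type invented for the example. With an int on the right-hand side the overload is accepted, which is why a general `c << x` for arbitrary element types does not work under that rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type used only to demonstrate the operator signature rule.
public sealed class IntChannel
{
    private readonly Queue<int> _items = new Queue<int>();
    public int Count => _items.Count;

    // Accepted: the left operand is the containing type and the right operand is int.
    public static IntChannel operator <<(IntChannel channel, int value)
    {
        channel._items.Enqueue(value);
        return channel;
    }

    // Rejected under the rule quoted above, because the right operand is not int:
    // public static IntChannel operator <<(IntChannel channel, string value) => channel;
}

public static class Program
{
    public static int EnqueueTwo()
    {
        var c = new IntChannel();
        c = c << 1 << 2;   // left-associative: (c << 1) << 2
        return c.Count;
    }

    public static void Main() => Console.WriteLine(EnqueueTwo());
}
```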

@masonwheeler

@HaloFour Probably to prevent exactly this sort of abuse of operator overloads. (See: C++ streams)

@CyrusNajmabadi
Member

This one is a little better:

Saying "Run" seems fine to me. I don't see any real value in trying to condense that down any further to a specific character. I mean, it's not like 'go' avoids saying the word 'go' itself. When there's already a very clean and easy way to do things, i don't think there's tremendous value in going overboard on syntactic brevity.

@omidkrad

omidkrad commented May 9, 2018

I agree. I'm withdrawing my suggestion! :)

@masonwheeler

When there's already a very clean and easy way to do things, i don't think there's tremendous value in going overboard on syntactic brevity.

Agreed. 2 mch brvt mks thgs hrd 2 nrstnd!

@MI3Guy

MI3Guy commented May 9, 2018

@HaloFour Challenge accepted.

https://gist.github.com/MI3Guy/aa8491634410beabbdb6bd042a2ca647

@CyrusNajmabadi
Member

That's hilarious.

@masonwheeler

Wow, that's kind of horrifying!

@yaakov-h
Member

yaakov-h commented May 9, 2018

“kind of”??

@scalablecory

I love language wars as much as the next guy, but at what point do we agree that a topic has become an exercise in trolling and bikeshedding and close it...

@CyrusNajmabadi
Member

@scalablecory Sometimes it is good to have a honeypot.

@omidkrad

omidkrad commented May 9, 2018

Next big C# language should focus on supporting building distributed system and concurrency programming

So the conclusion is: "No, C# already has async/await, use relevant libraries to do complex scenarios"?

Please vote up/down.

@scalablecory

@omidkrad That's not how this works. We vote on specific proposals, not vague unsubstantiated musings.

@omidkrad

omidkrad commented May 9, 2018

Microsoft once had the CCR/DSS toolkit for programming concurrent/distributed software services, but it was not easy to use for the average developer. With async/await, C# has come a long way from there, but I still believe it should be a lot easier to create concurrent and distributed software. If distributed software were more natively supported in the language/framework, that would set the standard for developers to follow, just as async/await does for asynchronous code.

@FrankSzendzielarz

What actually is the suggestion in this thread? If there is one, is it any different from what Rx.NET already offers?

@gafter
Member

gafter commented Jun 3, 2019

@FrankSzendzielarz There isn't a specific suggestion, just a suggestion of what kind of thing should be suggested.

@masonwheeler

a suggestion of what kind of thing should be suggested.

Getting a little meta, are we? 😛

@HaloFour
Contributor

HaloFour commented Jul 15, 2020

@ZiadUber

That's exactly what Java/JVM is doing now with Project Loom. Seems they have it figured out.

Indeed they are, and it'll be an interesting experiment. I think the Java world may be a little more insulated from this given the vast majority of the ecosystem does go through the JRE so theoretically the majority of places where a thread might block can be updated to properly support virtual threading. All third party native code will have to be updated. Blocking the underlying OS thread will always be possible and potentially severely reduce the concurrency of the virtual threads, especially since by default they all share the same common fork/join pool. And I'll be really curious if they'll have a mechanism to detect deadlocking like Go has, probably not. What will be particularly interesting is that Java mixes OS threads and virtual threads, and use of the latter must be deliberate.

I did a LabWeek project with the early access bits of Loom last month and it was a lot of fun. I particularly enjoyed the continuation primitives which allow you to write generators/async without language modifications. And I was pretty impressed with how far virtual threads have been implemented so far. I even used a SynchronousQueue<T> to pass data between multiple virtual threads backed by a single threaded executor, which was kind of mind blowing (well, not if you're a Go programmer, I guess). Virtual threads seem to cost about 2.5k each in heap space vs. 1 MB of stack space by default and I was able to kick off almost 200,000 of them before triggering a sigsegv and crashing the runtime.

Some Loom experiments

@ZiadUber

ZiadUber commented Jul 16, 2020

@HaloFour

Thank you for the reply. Those experiments are quite interesting! I haven't experimented with a Loom JVM branch yet, so I was not aware that you could write generators without the overhead of spawning a separate fiber, as you'd do in Go.

Would you mind sharing the rest of the code in the sandbox package (e.g. sandbox.Generators.createGenerator and sandbox.Async.await)? Cheers.

@HaloFour
Contributor

@ZiadUber

Here are those two files. Being a LabWeek project they're a little sloppy, the goal being just to get them to work enough to demonstrate the concepts. Enjoy!

Generators.java
Async.java

@Thaina

Thaina commented Aug 24, 2020

What difference between Loom and Reactive.Linq though?

@HaloFour
Contributor

@Thaina

What difference between Loom and Reactive.Linq though?

Loom is an implementation of delimited continuations in the Java runtime. That allows you to capture the execution context and stack of a thread into a variable (the functional interface Runnable specifically) and then to resume it at some arbitrary point in the future on any thread. From those building blocks you can construct coroutines and green threads (what Loom calls virtual threads). From the former you can add C#-like features like iterators or async/await without requiring special compiler support. The latter combined with a compatible ecosystem lets you write entirely blocking imperative code without actually blocking any underlying threads.

The argument for a model like Loom or Go is that the only real reason for async APIs is to avoid blocking threads, and the only reason to avoid blocking threads is that threads are expensive. If threads were very cheap then there is much less of a reason to have async APIs or async-specific language features. Java put out another early access build in July and they've stabilized the implementation quite a bit. I was able to spin up 2 million virtual threads and "block" them all on a shared lock with a single backing thread and all within 2 GB of memory. 2 million real threads in both Java and C# would require 2 TB of memory, even if they were all blocked and not doing anything.

The tricky bit is that "compatible ecosystem", as anything under the hood that blocks needs to be modified to understand how to yield the virtual thread and queue up completion to resume that virtual thread on some other backing thread. Accidentally blocking real threads can starve all of the virtual threads from being scheduled. Java is taking a little bit of a gamble that they can manage all of the points where blocking can occur from within the runtime and that third-party native implementations of I/O libraries that could block can be expected to be updated.
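
The memory figures in this comment can be checked with back-of-the-envelope arithmetic. The sketch below assumes the 1 MB default stack size per OS thread and the 2 GB total quoted above, treats MB and GB as binary units, and uses constant and method names made up for the example. It works out to roughly 2.1 × 10^12 bytes (about 2 TB) for 2 million real threads, versus a budget of roughly 1 KB per virtual thread.

```csharp
using System;

public static class Program
{
    public const long Threads = 2_000_000;
    public const long StackBytes = 1L * 1024 * 1024;             // assumed 1 MB stack per OS thread
    public const long VirtualBudget = 2L * 1024 * 1024 * 1024;   // the 2 GB reported for virtual threads

    // Total stack space if every thread were a real OS thread.
    public static long OsThreadBytes() => Threads * StackBytes;

    // Average budget per virtual thread within the reported 2 GB.
    public static long BytesPerVirtualThread() => VirtualBudget / Threads;

    public static void Main()
    {
        Console.WriteLine(OsThreadBytes());          // total bytes for 2 million OS thread stacks
        Console.WriteLine(BytesPerVirtualThread());  // bytes available per virtual thread
    }
}
```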

@333fred
Member

333fred commented Aug 24, 2020

There's been a lot of good conversations on this thread. However, as there's no real language proposal here, I'm going to close this out. Feel free to continue using it for discussion if you want to.

@333fred 333fred closed this as completed Aug 24, 2020