-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spurious reasoning for C# over Python #3
Comments
Hi, I appreciate the feedback, however I think you read that with a particularly negative minded light, so I'd ask if you can take a moment to understand the intended meaning clearly before we continue (and, when you feel you do understand how it's intended, provide suggestion(s) for how to phrase it more clearly so that the intention gets across).
|
okay. those are good points. i'm sorry if my comment came across as overly critical. I figured the details I provided would be enough on a technical level. I have taken some time to rewrite it based on your feedback, so that there is a suggested change which outlines the strengths of the C# approach while ensuring the Python is still outlined:
|
I would like to echo the sentiment from the OP that the ML space is entirely in python, and by developing in .NET, you are effectively isolating your tool from the ability of many would-be contributors from this space to contribute. That audience is the most likely to be engaged in actually contributing.
I feel there may be a gap in your understanding about the performance limitations of python. The points talked about here sound academic, and are probably rooted in a time when a historical python version (such as 2.7) was the norm. As python is now, I find the idea that the performance characteristics of python necessarily being a limiting factor as outdated at best. Between plain old threading, coroutines, and multiprocessing - all features of the standard library - and smart architecture, I suspect any well enough engineered system could - in pure python - stand up to the rigor of the features needed to make the Swarm project work.
And to reiterate, while you're not technically removing anything, you're obscuring the ability of anyone already in this space and not familiar with C# to contribute or consume it in a more meaningful way. |
thank you for echoing my original sentiment, and expanding upon it. however, I agree with Alex. my initial reaction was knee-jerk based on the same incorrect assertions made in the README.md - but having an expanded ecosystem in the direction of C# is unlikely to harm the python ecosystem. on the contrary, developers who encounter the limitations of C# development are likely to look toward solutions like huggingface/diffusers and make the jump into learning Python. but if they never hit those limits, and the available C# solution works for a C# developer, that is a net win for StabilityAI, and C# developers. |
I didn't imply, nor do I believe, that choosing .NET is an inherently bad idea, nor do I think it will harm anything. I do, however, think the reasons given for choosing it over python are spurious at best. |
yes. i think "we just wanted to do this for the C# community, and it's what we are comfortable with" makes total sense too. |
Sure. That's a business decision that I wouldn't argue with. They are not compelled to use any one technology, but as originally stated, its not accurate to say python is fundamentally less capable to do the stated goals. |
I'm still unsure how to best rewrite this, but this gets a little closer/clearer for the moment.
As noted here: 8d02e78#r122956544, it sounds like you or your team prefers C#, which is fine, but the claims - particularly about multi-core use - are flatly incorrect. I see no harm in changing the language to emphasize that C# is your/your team's preferred tool to accomplish your goals using technology you understand, rather than making blanket statements about an entire programming language which do not seem to bear the weight of what is claimed. |
The fact that python can use more cores by spinning up more processes is technically correct but not relevant, that is not a high-performance multicore solution, that's a hack. Python is simply not an ideal choice for projects that need high performance multicore processing. The decision was not made flippantly. I did significant testing. Some of my experimental 1:1 code comparisons ran literally orders of magnitude faster in C# than they did in Python, even the singlethreaded ones. I chose the high-performance option because it was the high-performance option. You can have your own opinions and preferences, if you want to use python for things that python is less capable at than other tools, that's your choice, but I chose to use the technology that does better at the goals we needed, and I tried my best explained that choice in the readme. If you would like to help improve the phrasing of that explanation, that's welcome, but the aggressive response here is not needed. ps I appreciate your responses here @bghira I don't think that the suggested rewrite there would be best, just because - well, it's a 5 paragraph essay replacing what was meant to be a couple sentences and is already overly long. |
@mcmonkey4eva i would have been disappointed if you'd used the text as-is. i just wanted to give you more feedback and help nudge it in the direction you wanted. what tazlin is saying is also unfortunately correct. the fact that you go out of your way to make performance comparisons is really going to 'upset' some people, who will then go out of their way to bring it up here, probably again and again. it seems like a few charts of tests you've done would be helpful, but if that's out of scope, maybe just removing the mentions of python from the readme is the way to go. there's no point in bringing up a comparison if we can't see the test methodology or the results. |
for additional data, reference this old stackoverflow answer about Python vs C# performance. simply put, out of the box, Python doesn't perform as well. but using tools like PyPy or especially cython will bring python's performance beyond that of C#. but we don't have the tests, or the data. and so these questions/discussions will probably resurface until that is a settled question. thus is the nature of FOSS. |
So, update, it's finally doc-writing time, so - the Motivations are now in their own separate doc, have been expanded, and there's an explicit note of |
i have an updated translation for your consideration, now that it is a separate living document:
|
This comment was marked as off-topic.
This comment was marked as off-topic.
please, no more arguing about it. because you're going to hear facts that disagree with you. and this will drag ON AND ON. anyone who responds further here will receive a free block from me, and it will no longer be possible to do code review on my pull requests, or comment on any issues I open. |
It's true that Python is an interpreted language, which makes it slower compared to compiled languages like C#. However, in the context of machine learning (ML) and scientific computing, Python's performance is not usually the bottleneck. The heavy computations in ML are offloaded to highly-optimized C/C++ libraries (like NumPy or PyTorch's underlying libraries) or to GPU-accelerated libraries like CUDA. This means that, in practice, ML code written in Python can run just as quickly as similar code written in other languages.
While Python has a Global Interpreter Lock (GIL) that can limit the effectiveness of multithreading by allowing only one native thread to execute at a time. However, this does not mean that Python "lacks true multithreading capabilities". Python's threading module does support multithreading, although it might not provide a speedup for CPU-bound tasks due to the GIL. However, for I/O-bound tasks (like loading data from disk), Python's multithreading can provide a significant speedup. Moreover, Python has other ways to achieve parallelism, like multiprocessing (via the multiprocessing module), and many of the underlying C/C++ libraries used in ML can effectively utilize multi-core processors independent of Python's GIL.
Developer Ecosystem: Python is currently the dominant language in the data science and machine learning field. Most of the major ML libraries (like TensorFlow, PyTorch, SciKit-Learn, etc.) have been developed in and for Python, and Python's simple syntax and vast ecosystem of scientific computing libraries make it very accessible to new developers. While developing ML tools in C# might indeed attract developers from the C# ecosystem, it does not necessarily "enable a wider range of developers to make use of Stable Diffusion". In fact, it could limit the reach of these tools, as they would not be usable by the large number of developers who are already working in the Python ecosystem.
Can you please update the text to be more accurate?
The text was updated successfully, but these errors were encountered: