Consider reference counter as alternative garbage collector #4029
Comments
@mirhagk, thank you for opening this very interesting issue :-) I think you'd be pleased to know that I have actually already been thinking about a ref counting GC :-) so indeed this is within our interest. As you pointed out, obviously this would be a very big change and I don't have a plan yet (still in the thinking stage) but I will keep you updated when I have something to share. Our background GC was a big step toward optimizing for better latency, and as seen in many of our customers' scenarios it has worked out great. I do have more performance tuning work to do to make it more predictable (the pauses are shorter, but they are dictated by allocations, not by time or CPU). On your comment about "a little out of date wrt academic papers", note that many academic papers focus on theory and do not try to accommodate a very wide range of scenarios (which our GC has to do). With that said, of course I'm always in search of interesting GC papers to read. |
Yes of course, I'm not going to pretend I'm anywhere near an expert on garbage collection, but I am aware of the Immix algorithm, which supposedly gets space efficiency, fast collection, and good mutator performance. It claims to be 7-25% faster than semi-space, mark-sweep and mark-compact, and also claims to be 5% faster than an already highly tuned garbage collector. I don't know if Immix has real-world problems that prevent its usefulness, or if it's simply a matter of the current GC still having room for improvement, and not needing to make such drastic switches for a relatively meager performance gain. I think mostly where the .NET framework lacks w.r.t. scientific papers is static garbage collection, or simply garbage elimination. I don't know if it's something .NET can address, except in the case of .NET Native. The C# compilers can't do significant garbage elimination (AFAIK stack allocation is slower in .NET than heap allocation, so it can't even do the "easy" ones from escape analysis). And the JIT compiler doesn't have the luxury of spending a couple seconds on trying to eliminate garbage while compiling, as it needs to start executing code as fast as possible. Off-topic to this discussion, but is the current GC closely coupled to the rest of the framework, or would it be plausible for someone not closely familiar with the framework (like myself) to use .NET as a playground for garbage collection experimentation? |
I believe I recently read/heard something about the GC being designed quite standalone on purpose, to make it easy to substitute it with a different one as long as it adheres to the contract of the runtime... or that might've been regarding the JIT. Not 100% sure about it now. |
@Joe4evr The GC is not tightly coupled with the framework/runtime. https://github.com/dotnet/coreclr/tree/master/src/gc/sample is an example that shows how to use the GC without the rest of CoreCLR. We are interested in changes to make this example better. |
@richlander Pretty much, except I have like 0 knowledge of C++ myself, I was just mentioning a point for discussion. |
The "alt-gc" direction is possible, but not as easy. The GC features that the CoreCLR runtime depends on (for full functionality) are pretty rich. They include features that are not always found in experimental GC implementations out there, such as interior pointers or pinning. It is non-trivial to build an alternative implementation of this rich, advanced feature set. Integrating a new GC was actually done as a research project back in 2003 on top of Rotor. Some of the difficulties highlighted by the project report are not valid anymore - the interface between the CoreCLR runtime and GC is much cleaner than it used to be back in 2003. However, a lot of it still holds - the issues listed in 5.1 "Integration Issues and Solutions" in particular. |
@jkotas @Maoni0 then do you think it's possible to port Azul C4 to CoreCLR as AltGC if the mechanism is supported? MS has a partnership with Azul on Java for Azure stuff AFAIK. |
@WalkingCat C4 requires kernel patches for paging notifications, no? |
@kangaroo so what? Azul has already done that work on Linux, and MS owns the Windows kernel, isn't it plausible? |
@WalkingCat So nothing -- sorry, didn't mean that to be a negative, more of a confirmation. The Azul GC is very interesting, and it's certainly plausible, I just wanted to ensure that everyone was aware of the scope of the question. |
@WalkingCat I think @kangaroo was suggesting that the scope of taking on a GC project that required updates to the Windows kernel might be a bit far reaching. I suspect we'd have to come up with a strong business driver to get the kernel team to do the work, as opposed to being the first alt-GC to try out a new GC extensibility point. I also suspect that @Maoni0 will only invest in an alt-GC mechanism when there is a concrete proposal to bring a new GC to .NET. Such a proposal would also need to satisfy the issues that @jkotas raised in the document he references. |
@Maoni0 Are you an employee of Microsoft? Please consider this idea: https://github.com/dotnet/coreclr/issues/555 |
@Maoni0 |
@ygc369 The issue with doing this is that freeing memory like this causes fragmentation. One of the big speed benefits of garbage collection is its ability to reclaim large chunks of memory at once, and use bump pointer allocation, rather than a free list. Doing this manual reclamation would probably actually cause worse performance, especially in the case where there isn't a lot of pressure. A way the garbage collector/compiler could take advantage of it is to allocate objects in a separate nursery, and collect that entire nursery all at once. I think a much easier and better option would be to make stack allocation quicker, and allocate objects like this on the stack, where they are quickly reclaimed. The compiler detecting and releasing memory on the heap is more flexible (as it could statically know when certain objects go out of scope, and their properties become garbage etc), but the stack based one would be a quick win. I could be wrong, but I believe the |
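For readers who want the fragmentation argument spelled out, here is a toy sketch in Python (not how the CLR allocator is implemented). `BumpAllocator` and `FreeListAllocator` are hypothetical names invented for this illustration, and "addresses" are just integer offsets into an imaginary 64-byte region. It shows that freeing individual objects can leave plenty of total free space with no hole large enough to use, while a nursery that is reset wholesale always hands back contiguous space.

```python
class BumpAllocator:
    """Hypothetical nursery: allocating is a pointer increment, reclaiming is a reset."""

    def __init__(self, size):
        self.size = size
        self.top = 0  # next free offset

    def alloc(self, n):
        if self.top + n > self.size:
            return None  # nursery full; a real collector would evacuate survivors first
        addr = self.top
        self.top += n
        return addr

    def reset(self):
        self.top = 0  # the whole region is reclaimed at once


class FreeListAllocator:
    """Hypothetical first-fit free list that coalesces adjacent holes on release."""

    def __init__(self, size):
        self.free = [(0, size)]  # sorted list of (address, length) holes

    def alloc(self, n):
        for i, (addr, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    del self.free[i]
                else:
                    self.free[i] = (addr + n, length - n)
                return addr
        return None  # no single hole is big enough

    def release(self, addr, n):
        self.free.append((addr, n))
        self.free.sort()
        merged = [self.free[0]]
        for a, length in self.free[1:]:
            prev_addr, prev_len = merged[-1]
            if prev_addr + prev_len == a:
                merged[-1] = (prev_addr, prev_len + length)  # merge neighbours
            else:
                merged.append((a, length))
        self.free = merged


heap = FreeListAllocator(64)
blocks = [heap.alloc(8) for _ in range(8)]  # heap is now full
for addr in blocks[::2]:
    heap.release(addr, 8)                   # free every other block
print(sum(n for _, n in heap.free))         # 32 bytes free in total
print(heap.alloc(16))                       # None: no hole is larger than 8

nursery = BumpAllocator(64)
for _ in range(8):
    nursery.alloc(8)
nursery.reset()                             # reclaim everything at once
print(nursery.alloc(16))                    # 0: contiguous space again
```

The free list still has half the region available after the releases, yet the 16-byte request fails; that is the fragmentation cost of per-object frees that the comment above describes.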
@richlander One of the motivations behind implementing something akin to C4 is that GC performance has contributed to lost contracts in the past. It was not the sole factor in the London Stock Exchange's 2009 decision to drop .NET (in favour of C++, I believe), but it was a consideration. Azul's customers are, I'm sure, predominantly financial institutions. It seems unfortunate that .NET is (arguably) not competitive in this space. What I'm less clear on is how well C4 performs on small heaps. Being able to eliminate the pauses is desirable for much more than financial number crunching, and would be nice to have even on small systems (such as phones and PCs). |
@Maoni0 I think a complete memory management mechanism should have the following three key features:
However, unfortunately, C# only has GC now. |
I think ARC can transparently be used for any class that provably does not have cycles. I believe that to mark a class as an ARC class, the only requirement is for it to only contain fields of:
They also must be sealed or carry a special attribute that would prevent adding any non-ARC fields when the type is inherited from. This outright would apply to As long as it does not contradict current behavior, I would simply call their finalizer the second their reference count drops to zero (I realize there might be a problem with resurrection, which may need another bit that would prevent finalization running a second time unless |
There is no plan to adopt ARC. I'm going to close this issue as a result. |
Tracing garbage collection is great for throughput and C# has a pretty well tuned garbage collector (even if it's a little out of date w.r.t. academic papers). But there are scenarios where throughput does not matter as much as consistent predictable performance, or scenarios that use a lot of memory and can't afford to pay the price of a tracing collector in either pausing or memory consumption.
Video games are a very common situation where tracing collection can be disastrous, and a lot of programmers have to work around this with object pools and the like, essentially managing memory themselves.
It'd be nice if it were possible to swap out the garbage collector in .NET (as is possible with many Java implementations), and a reference counter would be a useful one to swap in. Of course it should not change any guarantees that the language has (so there'll have to be a backup cycle collector) but merely change throughput, pausing, performance and memory consumption.
This would obviously be a very big change, as the GC is highly tuned and is quite a lot of code, and the new reference-counting GC would need to be able to hook into assignments and variable scope, but it has the potential to make C# that much more attractive for certain scenarios (such as game programming).
Alternatively this could be a consideration for .NET Native only, as .NET Native already seeks to reduce memory consumption on constrained devices, and reference counting has many similar goals. This is basically a request to consider reference counting as an alternative garbage collector somewhere in the .NET ecosystem.
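To make the request a bit more concrete, here is a rough toy model of what "reference counting with a backup cycle collector" means, written in Python purely for illustration. Nothing here is a .NET API: `Obj`, `RefCountHeap`, `assign`, and `collect_cycles` are hypothetical names, and a real implementation would have the JIT emit the count updates on assignments and scope exits rather than calling methods by hand.

```python
from collections import Counter


class Obj:
    """Hypothetical managed object: a name, a reference count, and reference fields."""

    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}


class RefCountHeap:
    def __init__(self):
        self.live = set()
        self.roots = Counter()  # stands in for locals and statics
        self.freed = []         # names, in the order they were reclaimed

    def new(self, name):
        obj = Obj(name)
        self.live.add(obj)
        return obj

    def add_root(self, obj):
        self.roots[obj] += 1
        obj.rc += 1

    def drop_root(self, obj):   # e.g. a local variable going out of scope
        self.roots[obj] -= 1
        if self.roots[obj] == 0:
            del self.roots[obj]
        self._dec(obj)

    def assign(self, owner, field, target):
        """The hook on assignments: increment the new target, then decrement the old one."""
        old = owner.fields.get(field)
        if target is not None:
            target.rc += 1
        owner.fields[field] = target
        if old is not None:
            self._dec(old)

    def _dec(self, obj):
        obj.rc -= 1
        if obj.rc == 0:         # reclaimed immediately, no tracing pause
            self.live.discard(obj)
            self.freed.append(obj.name)
            children = [c for c in obj.fields.values() if c is not None]
            obj.fields.clear()
            for child in children:
                self._dec(child)

    def collect_cycles(self):
        """Backup tracing pass: anything unreachable from the roots is cyclic garbage."""
        marked, stack = set(), list(self.roots)
        while stack:
            obj = stack.pop()
            if obj not in marked:
                marked.add(obj)
                stack.extend(c for c in obj.fields.values() if c is not None)
        garbage = self.live - marked
        for obj in garbage:
            for child in obj.fields.values():
                if child is not None and child not in garbage:
                    child.rc -= 1  # drop references held by dead objects
            obj.fields.clear()
        self.live -= garbage
        self.freed.extend(sorted(o.name for o in garbage))
        return len(garbage)


heap = RefCountHeap()
a = heap.new("a"); heap.add_root(a)
b = heap.new("b"); heap.assign(a, "child", b)
heap.assign(a, "child", None)  # b's count hits zero and it is freed right away
x = heap.new("x"); heap.add_root(x)
y = heap.new("y"); heap.assign(x, "next", y); heap.assign(y, "next", x)
heap.drop_root(x)              # x and y keep each other alive
print(heap.freed)              # ['b']
print(heap.collect_cycles())   # 2
print(heap.freed)              # ['b', 'x', 'y']
```

Acyclic garbage is reclaimed at the moment the last reference goes away, which is where the predictable latency comes from; the cycle `x <-> y` is only found by the tracing pass, which is why the language guarantees cannot be kept with counting alone.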