Consider reference counter as alternative garbage collector #4029

Closed
mirhagk opened this issue Mar 11, 2015 · 20 comments

@mirhagk
Contributor

mirhagk commented Mar 11, 2015

Tracing garbage collection is great for throughput and C# has a pretty well tuned garbage collector (even if it's a little out of date w.r.t. academic papers). But there are scenarios where throughput does not matter as much as consistent predictable performance, or scenarios that use a lot of memory and can't afford to pay the price of a tracing collector in either pausing or memory consumption.

Video games are a very common situation where tracing collection can be disastrous, and a lot of programmers have to work around this with object pools and the like, essentially managing memory themselves.

It'd be nice if it were possible to swap out the garbage collector in .NET (as is possible with many Java implementations), and a reference counter would be a useful alternative to swap in. Of course it should not change any guarantees that the language has (so there'll have to be a backup cycle collector); it should merely change throughput, pausing, performance and memory consumption.

This would obviously be a very big change, as the GC is highly tuned and consists of quite a lot of code, and the new reference-counting GC would need to be able to hook into assignments and variable scope, but it has the potential to make C# that much more attractive for certain scenarios (such as game programming).

Alternatively this could be a consideration for .NET native only, as .NET native already seeks to reduce memory consumption on constrained devices, and reference counting has many similar goals. This is basically a request to consider reference counting as an alternative garbage collector somewhere in the .NET ecosystem.

@Maoni0
Member

Maoni0 commented Mar 11, 2015

@mirhagk, thank you for opening this very interesting issue :-) I think you'd be pleased to know that I have actually already been thinking about a ref counting GC :-) so indeed this is within our interest. As you pointed out, obviously this would be a very big change and I don't have a plan yet (still in the thinking stage) but I will keep you updated when I have something to share.

Our background GC was a big step in the direction of optimizing for better latency, and as seen in many of our customers' scenarios it has worked out great. I do have more performance tuning work to do to make it more predictable (the pauses are shorter, but they are dictated by the allocations, not by time or CPU).

On your comment about "a little out of date wrt academic papers", note that many academic papers focus on theory and do not try to accommodate a very wide range of scenarios (which our GC has to do). With that said, of course I'm always in search of interesting GC papers to read.

@mirhagk
Contributor Author

mirhagk commented Mar 12, 2015

note that many academic papers focus on theory and do not try to accommodate a very wide range of scenarios

Yes of course, I'm not going to pretend I'm anywhere near an expert on garbage collection, but I am aware of the Immix algorithm, which supposedly gets space efficiency, fast collection, and good mutator performance. It claims to be 7-25% faster than semi-space, mark-sweep and mark-compact, and also claims to be 5% faster than an already highly tuned garbage collector.

I don't know if Immix has real-world problems that limit its usefulness, or if it's simply a matter of the current GC still having room for improvement, so that such a drastic switch isn't needed for a relatively meager performance gain.

I think mostly where the .NET framework lacks w.r.t. scientific papers is static garbage collection, or simply garbage elimination. I don't know if it's something .NET can address, except in the case of .NET native. The C# compilers can't do significant garbage elimination (AFAIK stack allocation is slower in .NET than heap allocation, so it can't even do the "easy" ones from escape analysis). And the JIT compiler doesn't have the luxury of spending a couple seconds on trying to eliminate garbage while compiling, as it needs to start executing code as fast as possible.

Offtopic to this discussion, but is the current GC closely coupled to the rest of the framework, or would it be plausible for someone not closely familiar with the framework (like myself) to use .NET as a playground for garbage collection experimentation?

@Joe4evr
Contributor

Joe4evr commented Mar 12, 2015

Offtopic to this discussion, but is the current GC closely coupled to the rest of the framework, or would it be plausible for someone not closely familiar with the framework (like myself) to use .NET as a playground for garbage collection experimentation?

I believe I recently read/heard something about the GC being designed quite standalone on purpose, to make it easy to substitute it with a different one as long as it adheres to the contract of the runtime... or that might've been regarding the JIT. Not 100% sure about it now.

@jkotas
Member

jkotas commented Mar 12, 2015

@Joe4evr The GC is not tightly coupled with the framework/runtime. https://github.com/dotnet/coreclr/tree/master/src/gc/sample is an example that shows how to use the GC without the rest of CoreCLR. We are interested in changes to make this example better.

@richlander
Member

@jkotas That's good. I was thinking of the same sample when I read the comment from @Joe4evr. That said, I read his comment differently, which wasn't that he wanted to use the GC in a different runtime, but use a different GC with CoreCLR. Basically "alt-gc".

@Joe4evr
Contributor

Joe4evr commented Mar 13, 2015

@richlander Pretty much, except I have like 0 knowledge of C++ myself, I was just mentioning a point for discussion.

@jkotas
Member

jkotas commented Mar 13, 2015

The "alt-gc" direction is possible, but not as easy.

The GC features that the CoreCLR runtime depends on (for full functionality) are pretty rich. It includes features that are not always found in experimental GC implementations out there such as interior pointers or pinning. It is non-trivial to build an alternative implementation of the rich advanced feature set.

Integrating a new GC was actually done as a research project back in 2003 on top of Rotor. Some of the difficulties highlighted by the project report are not valid anymore: the interface between the CoreCLR runtime and the GC is much cleaner than it was back in 2003. However, a lot of it still holds, the issues listed in 5.1 "Integration Issues and Solutions" in particular.

@WalkingCat

@jkotas @Maoni0 Then do you think it's possible to port Azul C4 to CoreCLR as an alt-GC, if the mechanism is supported? MS has a partnership with Azul on Java for Azure, AFAIK.
http://www.azulsystems.com/technology/c4-garbage-collector

@kangaroo
Contributor

@WalkingCat C4 requires kernel patches for paging notifications, no?

@WalkingCat

@kangaroo So what? Azul has already done that work on Linux, and MS owns the Windows kernel, so isn't it plausible?

@kangaroo
Contributor

@WalkingCat So nothing. Sorry, I didn't mean that to be a negative, more of a confirmation. The Azul GC is very interesting, and it's certainly plausible; I just wanted to ensure that everyone was aware of the scope of the question.

@richlander
Member

@WalkingCat I think @kangaroo was suggesting that the scope of taking on a GC project that required updates to the Windows kernel might be a bit far reaching. I suspect we'd have to come up with a strong business driver to get the kernel team to do the work, as opposed to being the first alt-GC to try out a new GC extensibility point.

I also suspect that @Maoni0 will only invest in an alt-GC mechanism when there is a concrete proposal to bring a new GC to .NET. Such a proposal would also need to satisfy the issues that @jkotas raised in the document he references.

@ygc369

ygc369 commented Mar 25, 2015

@Maoni0 Are you an employee of Microsoft? Please consider this idea: https://github.com/dotnet/coreclr/issues/555

@ygc369

ygc369 commented Mar 25, 2015

@Maoni0
In Apple's LLVM, the compiler can know when to free memory and add free operations automatically. Can the .NET compiler do the same thing? Even though the .NET compiler can't identify all garbage objects at compile time without reference counts, it can at least identify some of them in many cases. For example:

    void example(int n)
    {
        int[] a = new int[n];
        /* If the compiler is smart enough, it should know that the array "a" is
           garbage here, because "a" is not stored on the heap or in a static
           variable. So the compiler can automatically insert a "free a"
           operation before the return. Of course, programmers should never be
           allowed to free objects manually. */
        return;
    }

If the compiler can do some of the GC's work, many garbage objects can be collected much earlier and more quickly, before a real GC runs, and the pressure on the GC will be lower. The rule is "Immediately collect objects that the compiler is 100% sure are garbage, and leave the rest to the GC".

@mirhagk
Contributor Author

mirhagk commented Mar 25, 2015

@ygc369 The issue with doing this is that freeing memory like this causes fragmentation. One of the big speed benefits of garbage collection is its ability to reclaim large chunks of memory at once and use bump pointer allocation rather than a free list. Doing this manual reclamation would probably cause worse performance, especially in the case where there isn't a lot of memory pressure.

A way the garbage collector/compiler could take advantage of it is to allocate objects in a separate nursery, and collect that entire nursery all at once.
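The bump pointer and whole-nursery ideas above can be illustrated with a minimal sketch. It is only a toy arena over a byte array, not how the CLR GC is implemented; `BumpArena`, `TryAllocate` and `Reset` are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical toy arena: NOT the CLR allocator, just an illustration.
sealed class BumpArena
{
    private readonly byte[] _buffer;
    private int _next; // offset of the first free byte

    public BumpArena(int capacity) => _buffer = new byte[capacity];

    public int Used => _next;

    // Allocation is a bounds check plus an addition ("bumping" the offset).
    // There is no free list to search.
    public bool TryAllocate(int size, out Span<byte> block)
    {
        if (size < 0 || size > _buffer.Length - _next)
        {
            block = default;
            return false;
        }
        block = _buffer.AsSpan(_next, size);
        _next += size;
        return true;
    }

    // Reclaims every block at once, the way a whole nursery could be
    // discarded. Individual blocks are never freed, so no holes appear.
    public void Reset() => _next = 0;
}
```

Freeing single objects out of such a region would leave holes that a bump allocator cannot reuse, which is the fragmentation concern raised above.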

I think a much easier and better option would be to make stack allocation quicker, and allocate objects like this on the stack, where they are quickly reclaimed. The compiler detecting and releasing memory on the heap is more flexible (as it could statically know when certain objects go out of scope, and their properties become garbage etc), but the stack based one would be a quick win.

I could be wrong, but I believe stackalloc (the localloc IL instruction) is quite slow. If this is something that could be sped up, then the compiler could perform escape analysis and optimize new to stackalloc.
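As a rough sketch of what that optimization would amount to, the rewrite can be done by hand today. This assumes C# 7.2 or later and `Span<T>`, neither of which existed when this comment was written; `StackAllocSketch` and the 256-element threshold are arbitrary choices made for illustration.

```csharp
using System;

static class StackAllocSketch
{
    // Heap version: the array is garbage as soon as the method returns,
    // but it still has to wait for a GC to be reclaimed.
    public static int SumHeap(int n)
    {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        int sum = 0;
        foreach (int x in a) sum += x;
        return sum;
    }

    // Stack version: small buffers live on the stack and vanish on return.
    // Large requests fall back to the heap to avoid overflowing the stack.
    public static int SumStack(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        Span<int> a = n <= 256 ? stackalloc int[n] : new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        int sum = 0;
        foreach (int x in a) sum += x;
        return sum;
    }
}
```

The buffer never escapes `SumStack`, which is the property an escape analysis pass would have to prove before making this substitution automatically.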

@ygc369

ygc369 commented Mar 25, 2015

@mirhagk What you said is a great idea! To alloc temp objects on the stack will lower the pressure of GC. The only limit is that objects allocated on the stack can't be too large and can't be referenced by the heap or static variables. Microsoft should consider this idea. @Maoni0

@DrPizza

DrPizza commented Sep 3, 2015

@richlander One of the motivations behind implementing something akin to C4 is that GC performance has contributed to lost contracts in the past. It was not the sole factor in the London Stock Exchange's 2009 decision to drop .NET (in favour of C++, I believe), but it was a consideration.

Azul's customers are, I'm sure, predominantly financial institutions. It seems unfortunate that .NET is (arguably) not competitive in this space.

What I'm less clear on is how well C4 performs on small heaps. Being able to eliminate the pauses is desirable for much more than financial number crunching, and would be nice to have even on small systems (such as phones and PCs).

@ygc369

ygc369 commented Jun 29, 2017

@Maoni0
Is there any progress with this issue?
I want an optional ARC to work together with the GC, reducing the GC's workload. They can help each other: ARC would handle most cases of freeing memory, while the GC would handle reference cycles and compact the heap.
GC should never be turned off, no matter whether ARC is on or off.
ARC should be optional. Programmers should have the right to decide whether to turn it on. (default off)

I think a complete memory management mechanism should have the following three key features:

  1. EA (escape analysis). Allocate obviously short-lived objects on the stack, instead of heap.
  2. ARC (automatic reference count). This should be optional, but once turned on, it should be able to collect most of garbage.
  3. GC (garbage collection). It should collect the rest of the garbage, which EA and ARC cannot deal with.

However, unfortunately, C# only has GC now.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the Future milestone Jan 30, 2020
@lostmsu

lostmsu commented May 5, 2022

I think ARC can transparently be used for any classes that provably do not have cycles. I believe that to mark a class as an ARC class, the only requirement is for it to only contain fields of:

  • unmanaged types (in the sense of C# unmanaged constraint)
  • types that previously were determined to be ARC
  • maybe also arrays of types previously determined to be ARC-compatible

They also must be sealed or carry a special attribute that would prevent adding any non-ARC fields when the type is inherited from.

This would apply outright to System.String, and to any classes that wrap native handles. The latter covers the scenario I have in mind: the Tensor class in TorchSharp.

As long as it does not contradict current behavior, I would simply call their finalizer the second their reference count drops to zero (I realize there might be a problem with resurrection, which may need another bit that would prevent finalization from running a second time unless GC.ReRegisterForFinalize is called).
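The "release the second the count drops to zero" behavior can be approximated manually today. The sketch below is not runtime-level ARC and not an existing .NET API: `RefCounted<T>`, `TryAddRef` and `Release` are hypothetical names, and it uses `Dispose` in place of a finalizer.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only; not part of .NET.
sealed class RefCounted<T> where T : class, IDisposable
{
    private readonly T _value;
    private int _count = 1; // the creator holds the first reference

    public RefCounted(T value) =>
        _value = value ?? throw new ArgumentNullException(nameof(value));

    public int Count => Volatile.Read(ref _count);

    // Takes another reference. Fails once the count has reached zero,
    // so an already released value is never resurrected.
    public bool TryAddRef()
    {
        while (true)
        {
            int current = Volatile.Read(ref _count);
            if (current == 0) return false;
            if (Interlocked.CompareExchange(ref _count, current + 1, current) == current)
                return true;
        }
    }

    // Drops a reference. The caller that moves the count from 1 to 0
    // disposes the wrapped value immediately, without waiting for a GC.
    public void Release()
    {
        while (true)
        {
            int current = Volatile.Read(ref _count);
            if (current == 0)
                throw new InvalidOperationException("Released more often than referenced.");
            if (Interlocked.CompareExchange(ref _count, current - 1, current) == current)
            {
                if (current == 1) _value.Dispose();
                return;
            }
        }
    }
}
```

Unlike the proposal above, every `TryAddRef` and `Release` call here has to be written by hand; the point of ARC would be for the compiler or runtime to insert them.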

@richlander
Member

There is no plan to adopt ARC. I'm going to close this issue as a result.

@ghost ghost locked as resolved and limited conversation to collaborators Jun 5, 2022