Add an option to not generate precise GC info #75817

Closed
MichalStrehovsky wants to merge 4 commits

MichalStrehovsky
Member

Follow up to #75803.

If enabled, conservative GC stack scanning will be used and metadata related to GC stack reporting will not be generated. The generated executable file will be smaller, but the GC will be less efficient (garbage collection might take longer and keep objects alive for longer periods of time than usual).

Saves 4.4% in size on a Hello World. I'll take that.

Cc @dotnet/ilc-contrib

@EgorBo
Member

EgorBo commented Sep 19, 2022

Some libraries tests are guarded with IsPreciseGcSupported which is currently implemented as "Not mono" - perhaps, worth adding this mode here for NativeAOT?

@VSadov
Member

VSadov commented Sep 19, 2022

Interesting. This could be a useful option for platforms that do not fully support stack walking.

> Saves 4.4% in size on a Hello World

I am curious about the % savings for larger apps. Assuming the native runtime size is fixed, larger apps may benefit a bit more.
Is it easy to measure the diff for ilc?

@MichalStrehovsky
Member Author

> Some libraries tests are guarded with IsPreciseGcSupported which is currently implemented as "Not mono" - perhaps, worth adding this mode here for NativeAOT?

CoreCLR is also capable of running in this mode. It might be worth it if we're adding official testing for this. It's not what I'm doing right now. (I don't know if there's a good way to probe for this.)

> I am curious about the % savings for larger apps. Assuming the native runtime size is fixed, larger apps may benefit a bit more.

It will be once this merges - about 15% of hello world (500 kB of 3.6 MB) is native code that isn't affected by this. This percentage will be smaller for larger apps. But 15% is already a pretty small number.

@VSadov
Member

VSadov commented Sep 19, 2022

> I don't know if there's a good way to probe for this.

For testing purposes, since you want to enable this and do a test pass, you could just set it to true temporarily to reduce noise.

@VSadov
Member

VSadov commented Sep 19, 2022

It is not a lot of noise though typically. False positives when doing conservative stack scans are relatively rare.

@EgorBo
Member

EgorBo commented Sep 19, 2022

> It is not a lot of noise though typically. False positives when doing conservative stack scans are relatively rare.

It was pretty annoying when Mono was wired up to run libs tests 🙂

@VSadov
Member

VSadov commented Sep 19, 2022

> It was pretty annoying when Mono was wired up to run libs tests

It does not need to be frequent for a failure to become annoying. :-)
And I guess they were a few different tests every time.

Statistically, though, stacks are shallow, and in a big(ish) program most objects would be rooted by statics and/or a few long-lived stack roots that hold large portions of the app context.

Conservative scanning is a bigger problem when both stack and heap are conservative. When it is just stacks, there are fewer opportunities for false positives.
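For readers following along, a minimal sketch of what a false positive means here. This is an illustrative model, not the runtime's code: `HeapRange`, `conservative_roots`, and `precise_roots` are hypothetical names, and the heap is modeled as one contiguous address range.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the GC heap as a single address range [begin, end).
struct HeapRange {
    std::uintptr_t begin;
    std::uintptr_t end;
    bool contains(std::uintptr_t value) const {
        return value >= begin && value < end;
    }
};

// Conservative scan: any stack word that falls inside the heap range is
// treated as a possible reference, even if it is really just an integer.
std::vector<std::uintptr_t> conservative_roots(
    const std::vector<std::uintptr_t>& stack_words, const HeapRange& heap) {
    std::vector<std::uintptr_t> roots;
    for (std::uintptr_t word : stack_words) {
        if (heap.contains(word)) {
            roots.push_back(word);
        }
    }
    return roots;
}

// Precise scan: only the slots that (hypothetical) GC info marks as live
// references are reported; everything else on the stack is ignored.
std::vector<std::uintptr_t> precise_roots(
    const std::vector<std::uintptr_t>& stack_words,
    const std::vector<std::size_t>& live_ref_slots) {
    std::vector<std::uintptr_t> roots;
    for (std::size_t slot : live_ref_slots) {
        if (slot < stack_words.size()) {
            roots.push_back(stack_words[slot]);
        }
    }
    return roots;
}
```

With a stack holding one real reference and one integer that happens to look like a heap address, the conservative scan reports both, while the precise scan reports only the slot the GC info names. The extra root is what keeps an otherwise dead object alive.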

@MichalStrehovsky
Member Author

exit code 139 means SIGSEGV Illegal memory access

Cool. I was surprised we can just not generate it and GC.Collect works. It's probably only GC.Collect that works.

Will probably have to leave a breadcrumb somewhere for the runtime not to expect it.
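For context on the exit code: POSIX shells conventionally report a process killed by signal N as exit code 128 + N, and SIGSEGV is signal 11 on Linux, hence 139. A small sketch of that decoding, with hypothetical helper names:

```cpp
#include <csignal>

// Returns the signal number encoded in a shell-style exit code (128 + N),
// or 0 if the code does not indicate death by signal.
int signal_from_exit_code(int exit_code) {
    return (exit_code > 128 && exit_code < 256) ? exit_code - 128 : 0;
}

// True when the exit code corresponds to a segmentation fault.
bool is_segfault_exit_code(int exit_code) {
    return signal_from_exit_code(exit_code) == SIGSEGV;
}
```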

@VSadov
Member

VSadov commented Sep 19, 2022

Do we need GC info for exception handling?

@jkotas
Member

jkotas commented Sep 19, 2022

> garbage collection might take longer and keep objects alive for longer periods of time than usual

We did some measurements around this some years back. The average perf hit for real-world server (compute-bound) apps is about 20% RPS.

> CoreCLR is also capable of running in this mode.

CoreCLR is not reliable (i.e. it will crash intermittently) with conservative stack scanning. Dynamic methods and collectible assemblies are not able to handle the situation where an unreachable object becomes reachable again.

@VSadov
Member

VSadov commented Sep 19, 2022

It would be interesting to do a similar comparison with NativeAOT if we have a good benchmark.

Compared with CoreCLR there are two differences:

  • On CoreCLR, enabling conservative mode disables asynchronous GC suspension; it becomes polling-only (the stack walking machinery starts reporting "I know nothing, this may not be jitted code"). That can make suspension pauses longer, during which neither the app nor the GC is working, and so it affects throughput.
    In NativeAOT, conservative stack reporting still supports async suspension in "everything managed is a safepoint" mode. That may actually lead to faster suspensions than in the regular case.

  • On CoreCLR, enabling conservative mode enables both conservative stack reporting and support for it in the GC.
    In NativeAOT it is only the first part. The GC support is turned on unconditionally, so the diff from enabling conservative mode is smaller.

It is hard to tell how much effect these differences have. The second is probably insignificant, but there is a chance the first has a measurable impact.

The place where CoreCLR turns off async suspension:

// Conservative GC enabled; behave as if HIJACK_NONINTERRUPTIBLE_THREADS had not been

@VSadov
Member

VSadov commented Sep 19, 2022

Actually USE_GC_INFO_DECODER seems to be off only on x86, so perhaps there is no difference between CoreCLR and NativeAOT.

@MichalStrehovsky
Member Author

> Do we need GC info for exception handling?

The crash is happening here:

GcInfoDecoder decoder(GCInfoToken(p), DECODE_REVERSE_PINVOKE_VAR);
INT32 slot = decoder.GetReversePInvokeFrameStackSlot();
assert(slot != NO_REVERSE_PINVOKE_FRAME);
TADDR basePointer = NULL;
UINT32 stackBasedRegister = decoder.GetStackBaseRegister();
if (stackBasedRegister == NO_STACK_BASE_REGISTER)
{
    basePointer = dac_cast<TADDR>(pRegisterSet->GetSP());
}
else
{
    basePointer = dac_cast<TADDR>(pRegisterSet->GetFP());
}

So we need GC info to unwind reverse P/Invokes. Looks like we would need to generate some sort of minimal GC info to allow for that. (There might be more; it's just the crash that I looked at.)

I'm going to close this. It would only be worth it if it were reasonably cheap, since it would likely stay an obscure undocumented switch anyway.

@AndyAyersMS
Member

Yeah, GC info conveys some non-GC info, so you can't skip emitting it entirely.

@ghost ghost locked as resolved and limited conversation to collaborators Oct 19, 2022