Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile System.Private.CoreLib with Crossgen2 by default, take 2 #46051

Closed
wants to merge 1 commit into from

Conversation

trylek
Copy link
Member

@trylek trylek commented Dec 14, 2020

After Bruce's and JanK's fixes I'm no longer able to repro the
previously observed x86 failures in libraries tests locally so I'm
switching back to lab testing; I plan to run several rounds of
PR and outerloop tests to increase my confidence in the change.

Thanks

Tomas

/cc @dotnet/crossgen-contrib

@trylek trylek added this to the 6.0.0 milestone Dec 14, 2020
@mangod9
Copy link
Member

mangod9 commented Dec 14, 2020

Interesting that the failure doesnt repro anymore. Could you please link the PRs for the fixes to this PR if they are available?

@trylek
Copy link
Member Author

trylek commented Dec 14, 2020

JanK's fix: #45999
Bruce's fix: #45947

@mangod9
Copy link
Member

mangod9 commented Dec 15, 2020

Thanks for adding those, so now the only criteria is to have CI pass outerloop jobs for x86?

@trylek
Copy link
Member Author

trylek commented Dec 15, 2020

Hmm, the non-deterministic nature of the bug makes it hard to define a precise criterion. On top of that, based on looking at the legs, I suspect that outerloop testing actually adds very little to libraries testing I'm most interested in at this moment. I guess that I'll just rerun the PR something like 5~10 times and I'll keep my fingers crossed - merging this in the next couple of days has the "thanksgiving advantage" that any irregularities will be easily seen in the reduced traffic of the holiday period. Having said that, the current working theory is that CG2 wasn't at fault at all by itself, it's just that some of the subtle diffs in code placement and import cell management aggravated the pre-existing GC hole and tailcall optimization bug that Bruce and JanK subsequently fixed.

@trylek trylek closed this Dec 15, 2020
@trylek trylek reopened this Dec 15, 2020
@trylek trylek closed this Dec 16, 2020
@trylek trylek reopened this Dec 16, 2020
@trylek trylek closed this Dec 17, 2020
@trylek trylek reopened this Dec 17, 2020
After Bruce's and JanK's fixes I'm no longer able to repro the
previously observed failures in libraries tests locally so I'm
switching back to lab testing; I plan to run several rounds of
PR and outerloop tests to increase my confidence in the change.

Thanks

Tomas
@@ -448,20 +448,7 @@ private Object HandleToObject(IntPtr handle)

private bool Get_CORINFO_METHOD_INFO(MethodDesc method, MethodIL methodIL, CORINFO_METHOD_INFO* methodInfo)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both crossgen1 and JIT are bailing out for PInvokes from getMethodInfo as well here:

if (!ftn->IsDynamicMethod() && (!ftn->IsIL() || !ftn->GetRVA() || ftn->IsWrapperStub()))

Why is this change necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does not appear to be fixing the x86 crash based on my quick local test. Were you able to confirm locally that it is fixing the problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I believe I see a CG1 vs. CG2 diff in NgenDump I tracked down to this difference, I have rolled back the 2nd commit based on your clarification that the error I've been attempting to fix is unrelated to my change. Amusingly this is about the 2nd or 3rd time the SPC CG2 switchover revealed a bug that ended up being completely unrelated to the switchover as such. Let's hope it just means that we do take CG2 testing seriously ;-).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I believe I see a CG1 vs. CG2 diff in NgenDump I tracked down to this difference,

Do you still see the diff in the current master? If yes, could you please share the details of the diff?

@trylek trylek changed the title WIP: Compile System.Private.CoreLib with Crossgen2 by default, take 2 Compile System.Private.CoreLib with Crossgen2 by default, take 2 Jan 5, 2021
@trylek
Copy link
Member Author

trylek commented Jan 5, 2021

I have removed the WIP status as the only remaining bug revealed by repeated testing of this change turns out to be unrelated to it - it's apparently #46184 based on JanK's clarification. I'll run another round of outerloop testing but right now I'm not aware of any remaining issues actually caused by this switch-over so I believe it's ready to be merged in.

@trylek
Copy link
Member Author

trylek commented Jan 14, 2021

Superseded by #47019, closing.

@trylek trylek closed this Jan 14, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Feb 14, 2021
@trylek trylek deleted the SPCGC2_2ndAttempt branch January 10, 2022 19:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants