Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

ARM64: Fix WriteBarrier #5821

Merged
merged 1 commit into from
Jun 16, 2016
Merged

ARM64: Fix WriteBarrier #5821

merged 1 commit into from
Jun 16, 2016

Conversation

kyulee1
Copy link

@kyulee1 kyulee1 commented Jun 15, 2016

I think this finally fixes most of long outstanding GC issues that appear in different
forms so far that include #4877, #4879, #4890 and more.

The issues was WriteBarrier implementation.
When we update card table, the address has been already incremented by 8.

With this fix, all corefx tests seem to run on xunit framework without
crashing. I verified this with an out-of-date build locally, so there are not found
assembly errors like System.Runtime.Extensions with different versions,
though. These are what I tested so far that safely finished all runs with very high percentage of pass.
System.Collections.Tests
System.Dynamic.Runtime.Tests
System.Linq.Expressions.Tests

Potentially more coreclr tests will pass. I will update them as tests go.

@kyulee1
Copy link
Author

kyulee1 commented Jun 15, 2016

@dotnet-bot test Windows_NT arm64 Checked

@kyulee1
Copy link
Author

kyulee1 commented Jun 15, 2016

@dotnet-bot test Windows_NT arm64 Checked pri1r2r

@kyulee1
Copy link
Author

kyulee1 commented Jun 15, 2016

@dotnet/arm64-contrib @dotnet/jit-contrib PTAL

@pgavlin
Copy link

pgavlin commented Jun 15, 2016

cc @Maoni0 @jkotas

@jkotas
Copy link
Member

jkotas commented Jun 16, 2016

Why do we increment x14 in the write barrier at all? It is different from write barrier contract from everywhere else.

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@jkotas I had the same question to myself, but I just followed other jit writebarriers including CheckedWriteBarrier with the explicit comment about argument. Who implement this BTW?

@briansull
Copy link

briansull commented Jun 16, 2016

I implemented it for ARM64.

The JIT_ByRefWriteBarrier needs to update both the src and dst pointers.
It does this by updating the src pointer and branching to the CheckedWriteBarrier which updates the dst pointer.

@briansull
Copy link

briansull commented Jun 16, 2016

; void JIT_ByRefWriteBarrier
; On entry:
; x13 : the source address (points to object reference to write)
; x14 : the destination address (object reference written here)
;
; On exit:
; x12 : trashed
; x13 : incremented by 8
; x14 : incremented by 8
; x15 : trashed
;
WRITE_BARRIER_ENTRY JIT_ByRefWriteBarrier

    ldr      x15, [x13], 8
    b        JIT_CheckedWriteBarrier

WRITE_BARRIER_END JIT_ByRefWriteBarrier 

@briansull
Copy link

briansull commented Jun 16, 2016

Looks Good

@rahku
Copy link

rahku commented Jun 16, 2016

looks good

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@dotnet-bot test Windows_NT arm64 release pri1r2r

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@dotnet-bot test Windows_NT arm64 release

@Maoni0
Copy link
Member

Maoni0 commented Jun 16, 2016

I looked at issue 4877 and I am not clear on exactly how this addresses that. So you believe that the ref we were missing to report is in this 8 bytes that we skipped?

@briansull
Copy link

With the off by 8 pointer address we can mark the wrong card table entry.
If we update the wrong card table entry then we can potentially miss a GC root and collect a live object.

@Maoni0
Copy link
Member

Maoni0 commented Jun 16, 2016

Yeah, I get that. But I am not sure how this is related to issue 4877 which talks about we are missing reporting a stack location.

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@Maoni0 Yes.
4877 had two issues. The immediate issue discussed about GC hole by wrong GC report in indirect call site was fixed by 5789 before.
The call stack shown in 4877 was still there, which this fix addressed. This happens with large allocations.

@Maoni0
Copy link
Member

Maoni0 commented Jun 16, 2016

Makes sense then! The fix looks good to me.

@jkotas
Copy link
Member

jkotas commented Jun 16, 2016

I understand why you would like to have the increment for ByRef write barrier. It sounds fine - it is something we always had.

But having the increment in regular write barrier - that is much hotter than ByRef write barrier - is just wasting an instruction.

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

I deleted the increment of address for the regular write barrier while keeping them for the byref write barrier.

@dotnet-bot test Windows_NT arm64 Checked
@dotnet-bot test Windows_NT arm64 Release
@dotnet-bot test Windows_NT arm64 Checked pr1r2r
@dotnet-bot test Windows_NT arm64 Release pr1r2r

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@dotnet-bot test Windows_NT arm64 Checked

@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

@dotnet-bot test Windows_NT arm64 release pri1r2r

I think this finally fixes most of long outstanding GC issues that appear in different
forms so far that include #4877, #4879, dotnet#4890 and more.

The issues was WriteBarrier implementation.
When we update card table, the address has been already incremented by 8.

With this fix, all corefx tests seem to run on xunit framework without crashing.
I verified this with an out-of-date build locally, so there are not found
assembly errors like System.Runtime.Extensions with different versions, though.
These are what I tested so far that safely finished all runs with very high percentage of pass.
System.Collections.Tests
System.Dynamic.Runtime.Tests
System.Linq.Expressions.Tests
System.Linq.Parallel.Tests

Potentially more coreclr tests will pass. I will update them as tests go.
@kyulee1
Copy link
Author

kyulee1 commented Jun 16, 2016

Without incrementing address for the normal writer barrier, I got over 900 failures -- http://dotnet-ci.cloudapp.net/job/dotnet_coreclr/job/master/job/arm64_cross_checked_windows_nt_prtest/250/consoleText
There may be a code path that relies on this. So, I keep this current fix as it is -- we may come back later to improve this with a GH issue.

@kyulee1 kyulee1 merged commit 9c50443 into dotnet:master Jun 16, 2016
@kyulee1 kyulee1 deleted the fixwb branch June 16, 2016 18:08
@jkotas
Copy link
Member

jkotas commented Jun 16, 2016

I have opened https://github.com/dotnet/coreclr/issues/5833 on getting this cleaned up.

jkotas pushed a commit that referenced this pull request Feb 5, 2017
jorive pushed a commit to guhuro/coreclr that referenced this pull request May 4, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants