Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Fix NRE caused by WeakReference collected on JS interop boundary. #54453

Merged
merged 1 commit into from
Jul 15, 2021

Conversation

pavelsavara
Copy link
Member

@pavelsavara pavelsavara commented Jun 19, 2021

  • Added GCHandle Normal while JSObject is in flight over JS/Managed interop boundary
  • Added new unit test for iterator.

Fixes #53872
Fixes #53957
Fixes #54655

@ghost
Copy link

ghost commented Jun 19, 2021

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Demonstrates #53872 on simple case

Author: pavelsavara
Assignees: -
Labels:

arch-wasm, area-System.Runtime.InteropServices.JavaScript

Milestone: -

@pavelsavara
Copy link
Member Author

[15:39:23] fail: [FAIL] System.Runtime.InteropServices.JavaScript.Tests.JavaScriptTests.Iterator
[15:39:23] info: System.Exception : Object reference not set to an instance of an object. At attempt=65516
[15:39:23] info: ---- System.NullReferenceException : Object reference not set to an instance of an object.
[15:39:23] info:    at System.Runtime.InteropServices.JavaScript.Tests.JavaScriptTests.Iterator()
[15:39:23] info: --- End of stack trace from previous location ---
[15:39:23] info: ----- Inner Stack Trace -----
[15:39:23] info:    at System.Runtime.InteropServices.JavaScript.Tests.JavaScriptTests.ToEnumerable(JSObject iterrator)+MoveNext()
[15:39:23] info:    at System.Linq.Enumerable.Count[Object](IEnumerable`1 source)
[15:39:23] info:    at System.Runtime.InteropServices.JavaScript.Tests.JavaScriptTests.Iterator()

See also
https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-54453-merge-8d232ae9d0ad44ad84/System.Private.Runtime.InteropServices.JavaScript.Tests/console.24bbb537.log?sv=2019-07-07&se=2021-07-09T15%3A26%3A50Z&sr=c&sp=rl&sig=LIMNicCFVPwVfiBNZhgnApW0L%2BFV0NziNhfxzLF5afE%3D

@pavelsavara pavelsavara changed the title [wasm]New unit test for JS iterator [wasm] Fix NRE caused by WeakReference collected on JS interop boundary. Jun 21, 2021
@pavelsavara pavelsavara marked this pull request as ready for review June 21, 2021 15:04
@pavelsavara pavelsavara requested a review from marek-safar June 21, 2021 15:06
@pavelsavara
Copy link
Member Author

Open questions

Alternative implementation, pass false from mono_wasm_register_obj

@kg
Copy link
Member

kg commented Jun 21, 2021

I agree that the ownsHandle logic looks wrong, at least based on convention where an ownsHandle=true usually means 'this new object owns the underlying handle'.

@pavelsavara
Copy link
Member Author

I just found this comment in the original PR

/// <param name="ownsHandle">Whether or not the handle is owned by the clr or not.</param>

@kg
Copy link
Member

kg commented Jun 21, 2021

That doesn't really clarify it at all :-)

@pavelsavara pavelsavara marked this pull request as draft June 22, 2021 08:29
@pavelsavara pavelsavara added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jun 25, 2021
@pavelsavara pavelsavara force-pushed the wasm_iterrator_NRE branch 5 times, most recently from fcbb9ab to dc2108f Compare July 2, 2021 12:07
@pavelsavara
Copy link
Member Author

Note, we could not use the counter on the SafeHandle base class, because it could not be increased after the handle was released once. The same instance of JSObject could be passed from JS space multiple times, even on multiple calls.

@pavelsavara pavelsavara removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jul 2, 2021
@pavelsavara pavelsavara marked this pull request as ready for review July 2, 2021 12:27
@pavelsavara
Copy link
Member Author

This doesn't address potential issue with naked pointers together with compacting GC. I have not seen that to manifest yet.
I drafted code for adding in-flight objects here pavelsavara@200ac14.
But I believe that proper solution is to pass GCHandles everywhere, not pointers nor GC roots .

Copy link
Member

@lewing lewing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically *InFlight is a strong ref held on top of the weak one across the transitions where we need to keep a strong ref?

@pavelsavara
Copy link
Member Author

So basically *InFlight is a strong ref held on top of the weak one across the transitions where we need to keep a strong ref?

Exactly. We keep the weak ref in place, because it's handle id has longer life-cycle.

Copy link
Member

@kg kg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see style issues

@pavelsavara pavelsavara requested a review from lewing July 13, 2021 16:42
@pavelsavara
Copy link
Member Author

Failure of Http2_MultipleConnectionsEnabled_InfiniteRequestsCompletelyBlockOneConnection_RemaningRequestsAreHandledByNewConnection is on alpine, not related to browser only change.

@lewing
Copy link
Member

lewing commented Jul 14, 2021

ci failure looks like #45204 which has been disabled now

@lewing
Copy link
Member

lewing commented Jul 14, 2021

@pavelsavara can you resolve the conflict

@pavelsavara
Copy link
Member Author

Rebased, let's see how it works together with rooting. It should be just fine.

@lewing
Copy link
Member

lewing commented Jul 14, 2021

two hits on #55536

@lewing
Copy link
Member

lewing commented Jul 15, 2021

runtime (Libraries Test Run release mono Linux x64 Debug) is

/root/helix/work/workitem /root/helix/work/workitem
  Discovering: System.Net.Quic.Functional.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Net.Quic.Functional.Tests (found 102 of 117 test cases)
  Starting:    System.Net.Quic.Functional.Tests (parallel test collections = on, max threads = 2)
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:02:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:02:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:04:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:04:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:06:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:06:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:08:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:08:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:10:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:10:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:12:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:12:16
   System.Net.Quic.Functional.Tests: [Long Running Test] 'System.Net.Quic.Tests.QuicStreamTests_MsQuicProvider.ReadWrite_Random_Success', Elapsed: 00:14:11
[Long Running Test] 'System.Net.Quic.Tests.MsQuicQuicStreamConformanceTests.ReadWrite_Success', Elapsed: 00:14:16

...
[EXECUTION TIMED OUT]
Exit Code:-3Executor timed out after 900 seconds and was killed
['System.Net.Quic.Functional.Tests' END OF WORK ITEM LOG: Command timed out, and was killed]

which sounds like #55642

@lewing
Copy link
Member

lewing commented Jul 15, 2021

runtime (Mono llvmaot Pri0 Runtime Tests Run Linux arm64 release) is

Traceback (most recent call last):
  File "/root/helix/work/correlation/reporter/run.py", line 13, in <module>
    from azure_devops_result_publisher import AzureDevOpsTestResultPublisher
  File "/root/helix/work/correlation/reporter/azure_devops_result_publisher.py", line 7, in <module>
    from azure.devops.connection import Connection
ModuleNotFoundError: No module named 'azure'
+ export _uploaderExitCode=1
+ date -u +%FT%TZ
2021-07-14T17:05:31Z
+ export PYTHONPATH=
+ exit 1

@lewing
Copy link
Member

lewing commented Jul 15, 2021

/azp run

@azure-pipelines
Copy link

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@lewing
Copy link
Member

lewing commented Jul 15, 2021

it hasn't restarted in 40 minutes and non of the failures were relevant to a browser only change.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants