Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Premature connection close #55903

Closed
ManickaP opened this issue Jul 19, 2021 · 5 comments
Closed

Premature connection close #55903

ManickaP opened this issue Jul 19, 2021 · 5 comments

Comments

@ManickaP
Copy link
Member

ManickaP commented Jul 19, 2021

Collected from different comments on different issues:

Stream open fails with INVALID_STATE

Saw something similar on stream today

  System.Net.Quic.Tests.MockQuicStreamConformanceTests.ReadWrite_Success(mode: SyncSpan, writeSize: 1, startWithFlush: True) [STARTING]
      System.AggregateException : One or more errors occurred. (One or more errors occurred. (Operation aborted.)) (One or more errors occurred. (Failed to open stream to peer. Error Code: INVALID_STATE))
      ---- System.AggregateException : One or more errors occurred. (Operation aborted.)
      -------- System.Net.Quic.QuicOperationAbortedException : Operation aborted.
      ---- System.AggregateException : One or more errors occurred. (Failed to open stream to peer. Error Code: INVALID_STATE)
      -------- System.Net.Quic.QuicException : Failed to open stream to peer. Error Code: INVALID_STATE
      Stack Trace:
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(80,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(49,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks, Int32 millisecondsTimeout)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(132,0): at System.Net.Quic.Tests.QuicTestBase`1.RunClientServer(Func`2 clientFunction, Func`2 serverFunction, Int32 iterations, Int32 millisecondsTimeout, QuicListenerOptions listenerOptions)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(136,0): at System.Net.Quic.Tests.QuicConnectionTests`1.Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException(Int32 writesBeforeClose)
        --- End of stack trace from previous location ---
        ----- Inner Stack Trace #1 (System.AggregateException) -----

        ----- Inner Stack Trace -----
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs(455,0): at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.AcceptStreamAsync(CancellationToken cancellationToken)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/QuicConnection.cs(98,0): at System.Net.Quic.QuicConnection.AcceptStreamAsync(CancellationToken cancellationToken)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(152,0): at System.Net.Quic.Tests.QuicConnectionTests`1.<>c__DisplayClass6_0.<<Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(137,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass22_0.<<RunClientServer>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(63,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        ----- Inner Stack Trace #2 (System.AggregateException) -----

        ----- Inner Stack Trace -----
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/Internal/QuicExceptionHelpers.cs(12,0): at System.Net.Quic.Implementations.MsQuic.Internal.QuicExceptionHelpers.ThrowIfFailed(UInt32 status, String message, Exception innerException)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs(163,0): at System.Net.Quic.Implementations.MsQuic.MsQuicStream..ctor(State connectionState, QUIC_STREAM_OPEN_FLAGS flags)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs(528,0): at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.OpenBidirectionalStream()
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/QuicConnection.cs(92,0): at System.Net.Quic.QuicConnection.OpenBidirectionalStream()
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(139,0): at System.Net.Quic.Tests.QuicConnectionTests`1.<>c__DisplayClass6_0.<<Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(147,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass22_0.<<RunClientServer>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(55,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)

@wfurt According to msquic code https://github.com/microsoft/msquic/blob/3898b2b88085a478eeb844885ea78daa9b060d06/src/core/stream.c#L207-L211:

    if (QuicConnIsClosed(Stream->Connection) ||
        Stream->Flags.Started) {
        Status = QUIC_STATUS_INVALID_STATE;
        goto Exit;
    }

Your error means the connection is closed. It just looks like another symptom of premature connection close, reported in this comment #55380 (comment)

Native crash

While diffing through, I founds another crashing pattern:

frame #3: 0x00007fff47d487e3 libmsquic.so`QuicOperationEnqueue(OperQ=0x00007fff2805f1c0, Oper=0x00007fff2c082770) at operation.c:138
   135 	    )
   136 	{
   137 	    BOOLEAN StartProcessing;
-> 138 	    CxPlatDispatchLockAcquire(&OperQ->Lock);
   139 	#if DEBUG
   140 	    CXPLAT_DBG_ASSERT(Oper->Link.Flink == NULL);
   141 	#endif

(QUIC_OPERATION) $4 = {
  Link = {
    Flink = 0x0000000000000000
    Blink = 0x00007fff00000000
  }
  Type = QUIC_OPER_TYPE_API_CALL
  FreeAfterProcess = '\x01'
   = {
    INITIALIZE = (Reserved = 0x00007fff6800ce80)
    API_CALL = {
      Context = 0x00007fff6800ce80
    }
    FLUSH_RECEIVE = (Reserved = 0x00007fff6800ce80)
    UNREACHABLE = {
      RemoteAddress = {
        Ip = (sa_family = 52864, sa_data = char [14] @ 0x0000563016f514ea)
        Ipv4 = {
          sin_family = 52864
          sin_port = 26624
          sin_addr = (s_addr = 32767)
          sin_zero = {
            [0] = '\0'
            [1] = '\0'
            [2] = '\0'
            [3] = '\0'
            [4] = '\0'
            [5] = '\0'
            [6] = '\0'
            [7] = '\0'
          }
        }
        Ipv6 = {
          sin6_family = 52864
          sin6_port = 26624
          sin6_flowinfo = 32767
          sin6_addr = {
            __in6_u = {
              __u6_addr8 = {
                [0] = '\0'
                [1] = '\0'
                [2] = '\0'
                [3] = '\0'
                [4] = '\0'
                [5] = '\0'
                [6] = '\0'
                [7] = '\0'
                [8] = '\0'
                [9] = '\0'
                [10] = '\0'
                [11] = '\0'
                [12] = '\0'
                [13] = '\0'
                [14] = '\0'
                [15] = '\0'
              }
              __u6_addr16 = ([0] = 0, [1] = 0, [2] = 0, [3] = 0, [4] = 0, [5] = 0, [6] = 0, [7] = 0)
              __u6_addr32 = ([0] = 0, [1] = 0, [2] = 0, [3] = 0)
            }
          }
          sin6_scope_id = 0
        }
      }
    }
    FLUSH_STREAM_RECEIVE = {
      Stream = 0x00007fff6800ce80
    }
    FLUSH_SEND = (Reserved = 0x00007fff6800ce80)
    TIMER_EXPIRED = (Type = 1744883328)
    STATELESS = {
      Context = 0x00007fff6800ce80
    }
  }
}

(lldb) p OperQ->Lock
(CXPLAT_DISPATCH_LOCK) $2 = {
  Mutex = {
    __data = {
      __lock = 0
      __count = 0
      __owner = 0
      __nusers = 0
      __kind = -1
      __spins = 0
      __elision = 0
      __list = {
        __prev = 0x0000000000000000
        __next = 0x0000000000000000
      }
    }
    __size = {
      [0] = '\0'
      [1] = '\0'
      [2] = '\0'
      [3] = '\0'
      [4] = '\0'
      [5] = '\0'
      [6] = '\0'
      [7] = '\0'
      [8] = '\0'
      [9] = '\0'
      [10] = '\0'
      [11] = '\0'
      [12] = '\0'
      [13] = '\0'
      [14] = '\0'
      [15] = '\0'
      [16] = '\xff'
      [17] = '\xff'
      [18] = '\xff'
      [19] = '\xff'
      [20] = '\0'
      [21] = '\0'
      [22] = '\0'
      [23] = '\0'
      [24] = '\0'
      [25] = '\0'
      [26] = '\0'
      [27] = '\0'
      [28] = '\0'
      [29] = '\0'
      [30] = '\0'
      [31] = '\0'
      [32] = '\0'
      [33] = '\0'
      [34] = '\0'
      [35] = '\0'
      [36] = '\0'
      [37] = '\0'
      [38] = '\0'
      [39] = '\0'
    }
    __align = 0
  }
}

the lock seems to be all empty.

Managed crash

Got different exception, this time managed one:

Process terminated. Assertion failed.
[Connection#58596441] Exception occurred during handling PEER_STREAM_STARTED connection callback: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'QuicConnection'.
   at System.Net.Quic.Implementations.MsQuic.MsQuicStream..ctor(State connectionState, SafeMsQuicStreamHandle streamHandle, QUIC_STREAM_OPEN_FLAGS flags) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs:line 106
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.State.TryQueueNewStream(SafeMsQuicStreamHandle streamHandle, QUIC_STREAM_OPEN_FLAGS flags) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 92
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.HandleEventNewStream(State state, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 285
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.NativeCallbackHandler(IntPtr connection, IntPtr context, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 624
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.NativeCallbackHandler(IntPtr connection, IntPtr context, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 624
/home/manicka/.scripts/run-tests.sh: line 16: 79009 Aborted
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net.Quic untriaged New issue has not been triaged by the area owner labels Jul 19, 2021
@ghost
Copy link

ghost commented Jul 19, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Collected from different comments on different issues:

Saw something similar on stream today

  System.Net.Quic.Tests.MockQuicStreamConformanceTests.ReadWrite_Success(mode: SyncSpan, writeSize: 1, startWithFlush: True) [STARTING]
      System.AggregateException : One or more errors occurred. (One or more errors occurred. (Operation aborted.)) (One or more errors occurred. (Failed to open stream to peer. Error Code: INVALID_STATE))
      ---- System.AggregateException : One or more errors occurred. (Operation aborted.)
      -------- System.Net.Quic.QuicOperationAbortedException : Operation aborted.
      ---- System.AggregateException : One or more errors occurred. (Failed to open stream to peer. Error Code: INVALID_STATE)
      -------- System.Net.Quic.QuicException : Failed to open stream to peer. Error Code: INVALID_STATE
      Stack Trace:
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(80,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(49,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks, Int32 millisecondsTimeout)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(132,0): at System.Net.Quic.Tests.QuicTestBase`1.RunClientServer(Func`2 clientFunction, Func`2 serverFunction, Int32 iterations, Int32 millisecondsTimeout, QuicListenerOptions listenerOptions)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(136,0): at System.Net.Quic.Tests.QuicConnectionTests`1.Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException(Int32 writesBeforeClose)
        --- End of stack trace from previous location ---
        ----- Inner Stack Trace #1 (System.AggregateException) -----

        ----- Inner Stack Trace -----
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs(455,0): at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.AcceptStreamAsync(CancellationToken cancellationToken)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/QuicConnection.cs(98,0): at System.Net.Quic.QuicConnection.AcceptStreamAsync(CancellationToken cancellationToken)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(152,0): at System.Net.Quic.Tests.QuicConnectionTests`1.<>c__DisplayClass6_0.<<Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(137,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass22_0.<<RunClientServer>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(63,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        ----- Inner Stack Trace #2 (System.AggregateException) -----

        ----- Inner Stack Trace -----
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/Internal/QuicExceptionHelpers.cs(12,0): at System.Net.Quic.Implementations.MsQuic.Internal.QuicExceptionHelpers.ThrowIfFailed(UInt32 status, String message, Exception innerException)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs(163,0): at System.Net.Quic.Implementations.MsQuic.MsQuicStream..ctor(State connectionState, QUIC_STREAM_OPEN_FLAGS flags)
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs(528,0): at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.OpenBidirectionalStream()
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/QuicConnection.cs(92,0): at System.Net.Quic.QuicConnection.OpenBidirectionalStream()
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicConnectionTests.cs(139,0): at System.Net.Quic.Tests.QuicConnectionTests`1.<>c__DisplayClass6_0.<<Dispose_WithOpenLocalStream_LocalStreamFailsWithQuicOperationAbortedException>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(147,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass22_0.<<RunClientServer>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /home/furt/github/wfurt-runtime/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(55,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)

Originally posted by @wfurt in #53224 (comment)

Author: ManickaP
Assignees: -
Labels:

area-System.Net.Quic, untriaged

Milestone: -

@ManickaP ManickaP removed the untriaged New issue has not been triaged by the area owner label Jul 19, 2021
@ManickaP ManickaP added this to the 6.0.0 milestone Jul 19, 2021
@ManickaP
Copy link
Member Author

Seeing one of the crashes again:

Process terminated. Assertion failed.
[conn][0x7F51C403E480] Exception occurred during handling PEER_STREAM_STARTED connection callback: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'QuicConnection'.
   at System.Net.Quic.Implementations.MsQuic.MsQuicStream..ctor(State connectionState, SafeMsQuicStreamHandle streamHandle, QUIC_STREAM_OPEN_FLAGS flags) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs:line 122
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.State.TryQueueNewStream(SafeMsQuicStreamHandle streamHandle, QUIC_STREAM_OPEN_FLAGS flags) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 99
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.HandleEventNewStream(State state, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 296
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.NativeCallbackHandler(IntPtr connection, IntPtr context, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 672
   at System.Net.Quic.Implementations.MsQuic.MsQuicConnection.NativeCallbackHandler(IntPtr connection, IntPtr context, ConnectionEvent& connectionEvent) in /home/manicka/repositories/runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs:line 672
/home/manicka/.scripts/run-tests.sh: line 22: 527853 Aborted                 (core dumped) ~/repositories/${runtime}/artifacts/bin/testhost/net6.0-Linux-Debug-x64/dotnet exec --runtimeconfig System.Net.${testLib}.Functional.Tests.runtimeconfig.json --depsfile System.Net.${testLib}.Functional.Tests.deps.json xunit.console.dll System.Net.${testLib}.Functional.Tests.dll -xml testResults.xml -nologo -notrait category=OuterLoop -notrait category=failing
Failed after 79 runs

I ran S.N.Http tests in a loop, it crashed on 79th run. I've had these tests locally re-enabled:

HttpClientHandlerTest_Http3.ClientSettingsReceived_Success
HttpClientHandlerTest_Http3.SendStreamLimitRequestsConcurrently_Succeeds
HttpClientHandlerTest_Http3.SendMoreThanStreamLimitRequestsConcurrently_LastWaits

@geoffkizer
Copy link
Contributor

This seems like it's one of the issues we identified in the code review -- specifically that we call TryAddStream too late, here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs#L119

It seems like we should call TryAddStream before we call the MsQuicStream constructor. And we shouldn't throw if it fails, we should just close the stream, since this indicates that the connection is shut down.

This may not completely fix the issue, but it would at least get us past the assertion. And we should never be throwing like this in a callback anyway.

@wfurt
Copy link
Member

wfurt commented Jul 25, 2021

I'm not sure if moving TryAddStream would help. Since we did not sassing to _state.ConnectionState ref-counting should work right and if somebody call Dispose or we shut the connection for other reason TryAddStream . This seems like general race condition when Connection is being disposed while creating stream. Do I miss something @geoffkizer ?
This is IMHO true also for the INVALID_STATE.

@ManickaP
Copy link
Member Author

Triage: we haven't seen this is a while, but it might be just due to multitude of disabled tests.
I'll close this for now, since we already have #55103 for future to redesign the dispose/finalization/closing logic.

@ghost ghost locked as resolved and limited conversation to collaborators Sep 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants