
SignalR Hub failing to send a message due to connection problems #675

Closed

dmitrysamuylovpharo opened this issue Oct 1, 2019 · 9 comments

@dmitrysamuylovpharo

We are running into a problem where SignalR message sends fail from within an ASP.NET Core web service hosted as an Azure Web App. This web service also implements and hosts the SignalR Hub, which connects to an Azure SignalR Service instance. After performing some operations, one of the controllers in this web service sends a SignalR message to notify all connected clients that an important operation has completed, so they can handle it accordingly. This send periodically fails in a way reminiscent of several issues reported here in the past. I hope someone can help me identify where the problem lies and how to prevent it from happening.
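The scenario described above (a controller broadcasting a completion event to all connected clients through the hub) typically looks something like the following sketch; the hub type, route, and method name (`MyHub`, `api/operations`, `"OperationCompleted"`) are illustrative assumptions, not taken from this issue:

```csharp
// Hedged sketch of the described scenario: a controller action that, after
// finishing its work, broadcasts a completion event to all connected clients.
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.SignalR;

public class MyHub : Hub { } // hypothetical hub type

[ApiController]
[Route("api/operations")]
public class OperationsController : ControllerBase
{
    private readonly IHubContext<MyHub> _hubContext;

    public OperationsController(IHubContext<MyHub> hubContext) => _hubContext = hubContext;

    [HttpPost("complete")]
    public async Task<IActionResult> Complete()
    {
        // ... perform the important operation ...

        // This is the kind of send that intermittently fails when the
        // service connection to Azure SignalR has been dropped.
        await _hubContext.Clients.All.SendAsync("OperationCompleted");
        return Ok();
    }
}
```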

I understand that disconnects can and do occur from time to time, but if I understand correctly, the SignalR connection will automatically reconnect. I saw the recommendation that clients should handle reconnecting themselves, and we have that in place on all our clients; it seems to be functioning fine. In this case the connection issue appears to be between the SignalR Hub and Azure SignalR.
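The client-side reconnect handling mentioned above can be sketched roughly as follows. This is a minimal sketch assuming the .NET client from ASP.NET Core 3.0 or later (`WithAutomaticReconnect` is not available in the 1.x client, where a `Closed` handler is the only option); the hub URL is hypothetical:

```csharp
// Hedged sketch of client-side reconnect handling, assuming the
// Microsoft.AspNetCore.SignalR.Client package from ASP.NET Core 3.0+.
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

public static class ClientReconnectExample
{
    public static async Task<HubConnection> ConnectAsync()
    {
        var connection = new HubConnectionBuilder()
            .WithUrl("https://example.com/myHub") // hypothetical hub URL
            .WithAutomaticReconnect()             // default retry delays: 0s, 2s, 10s, 30s
            .Build();

        // If automatic reconnect exhausts its retries, Closed fires and we
        // fall back to restarting the connection manually after a delay.
        connection.Closed += async _ =>
        {
            await Task.Delay(TimeSpan.FromSeconds(5));
            await connection.StartAsync();
        };

        await connection.StartAsync();
        return connection;
    }
}
```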

Here are the errors I'm seeing in my Application Insights for our web-service that hosts the SignalR Hub at the time of one of these failed message sends:

  timestamp [UTC] 2019-09-20T05:16:27.657992Z
  problemId System.Net.Sockets.SocketException at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
  type System.Net.Sockets.SocketException
  assembly System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e
  method System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
  outerType System.IO.IOException
  outerMessage Error while sending a message.
  outerAssembly System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e
  outerMethod System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
  innermostType System.Net.Sockets.SocketException
  innermostMessage An existing connection was forcibly closed by the remote host
  severityLevel 3
  details [{"parsedStack":[{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":0,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":1,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":2,"line":0},{"assembly":"System.Net.Security, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Security.SslStreamInternal+<g__CompleteAsync|36_1>d1.MoveNext","level":3,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":4,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":5,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":6,"line":0},{"assembly":"System.Net.Security, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Security.SslStreamInternal+<<WriteAsyncInternal>g__ExitWriteAsync|35_0>d1.MoveNext","level":7,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":8,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":9,"line":0},{"assembly":"Microsoft.AspNetCore.Http.Connections.Client, Version=1.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60","method":"Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport+d__20.MoveNext","level":10,"line":0}],"outerId":"0","message":"Error while sending a message.","severityLevel":"Error","type":"System.IO.IOException","id":"14093239"},{"outerId":"14093239","message":"An existing connection was forcibly closed by the remote host","severityLevel":"Error","type":"System.Net.Sockets.SocketException","id":"33060543"}]
  itemType exception
  customDimensions {"CategoryName":"Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport","AspNetCoreEnvironment":"Production","{OriginalFormat}":"Error while sending a message.","Exception":"System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host\r\n --- End of inner exception stack trace ---\r\n at System.Net.Security.SslStreamInternal.g__CompleteAsync|36_1[TWriteAdapter](ValueTask writeTask, Byte[] bufferToReturn)\r\n at System.Net.Security.SslStreamInternal.g__ExitWriteAsync|35_0[TWriteAdapter](ValueTask task)\r\n at Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport.StartSending(WebSocket socket)"}
  application_Version 4.0.0.0
  client_Type PC
  client_IP 0.0.0.0
  client_City Washington
  client_StateOrProvince Virginia
  client_CountryOrRegion United States
  iKey dbe46f33-04cc-405e-9e0b-31dd60f42f90
  sdkVersion ar_ilc:2.6.1
  itemId d7ca25c0-db65-11e9-a4bc-d1d8b00a2fe5
  itemCount  

Next it reported a few instances of this error:

  timestamp [UTC] 2019-09-20T05:16:29.2778395Z
  problemId System.Net.Sockets.SocketException at System.Net.WebSockets.ManagedWebSocket+d__65`2.MoveNext
  type System.Net.Sockets.SocketException
  assembly System.Net.WebSockets, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
  method System.Net.WebSockets.ManagedWebSocket+d__65`2.MoveNext
  outerType System.Net.WebSockets.WebSocketException
  outerMessage Connection 8f916f18-2d11-496a-bdff-a9dad81eb434 to the service was dropped.
  outerAssembly System.Net.WebSockets, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
  outerMethod System.Net.WebSockets.ManagedWebSocket+d__65`2.MoveNext
  innermostType System.Net.Sockets.SocketException
  innermostMessage A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
  severityLevel 3
  details [{"parsedStack":[{"assembly":"System.Net.WebSockets, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.WebSockets.ManagedWebSocket+d__652.MoveNext","level":0,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":1,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":2,"line":0},{"assembly":"Microsoft.AspNetCore.Http.Connections.Client, Version=1.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60","method":"Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport+<StartReceiving>d__19.MoveNext","level":3,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":4,"line":0},{"assembly":"System.IO.Pipelines, Version=4.0.0.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51","method":"System.IO.Pipelines.PipeCompletion.ThrowLatchedException","level":5,"line":0},{"assembly":"System.IO.Pipelines, Version=4.0.0.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51","method":"System.IO.Pipelines.Pipe.GetReadResult","level":6,"line":0},{"assembly":"System.IO.Pipelines, Version=4.0.0.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51","method":"System.IO.Pipelines.Pipe.GetReadAsyncResult","level":7,"line":0},{"assembly":"System.IO.Pipelines, Version=4.0.0.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51","method":"System.IO.Pipelines.Pipe+DefaultPipeReader.GetResult","level":8,"line":0},{"assembly":"Microsoft.Azure.SignalR.Common, Version=1.0.7.0, Culture=neutral, 
PublicKeyToken=adb9793829ddae60","method":"Microsoft.Azure.SignalR.ServiceConnectionBase+<ProcessIncomingAsync>d__53.MoveNext","level":9,"line":0}],"outerId":"0","message":"Connection 8f916f18-2d11-496a-bdff-a9dad81eb434 to the service was dropped.","severityLevel":"Error","type":"System.Net.WebSockets.WebSocketException","id":"3247361"},{"parsedStack":[{"assembly":"System.Net.Sockets, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.ThrowException","level":0,"line":0},{"assembly":"System.Net.Sockets, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.GetResult","level":1,"line":0},{"assembly":"System.Net.Security, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Security.SslStreamInternal+<<FillBufferAsync>g__InternalFillBufferAsync|38_0>d1.MoveNext","level":2,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":3,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":4,"line":0},{"assembly":"System.Net.Security, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Security.SslStreamInternal+d__341.MoveNext","level":5,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":6,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":7,"line":0},{"assembly":"System.Private.CoreLib, 
Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":8,"line":0},{"assembly":"System.Net.Http, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Http.HttpConnection+<ReadBufferedAsyncCore>d__95.MoveNext","level":9,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":10,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":11,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":12,"line":0},{"assembly":"System.Net.Http, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.Http.HttpConnection+RawConnectionStream+<ReadAsync>d__1.MoveNext","level":13,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":14,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":15,"line":0},{"assembly":"System.Net.WebSockets, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.WebSockets.ManagedWebSocket+<EnsureBufferContainsAsync>d__75.MoveNext","level":16,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":17,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":18,"line":0},{"assembly":"System.Net.WebSockets, Version=4.1.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a","method":"System.Net.WebSockets.ManagedWebSocket+<ReceiveAsyncPrivate>d__652.MoveNext","level":19,"line":0}],"outerId":"3247361","message":"Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.","severityLevel":"Error","type":"System.IO.IOException","id":"46617528"},{"outerId":"46617528","message":"A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond","severityLevel":"Error","type":"System.Net.Sockets.SocketException","id":"53239919"}]
  itemType exception
  customDimensions {"CategoryName":"Microsoft.Azure.SignalR.ServiceConnectionBase","AspNetCoreEnvironment":"Production","{OriginalFormat}":"Connection {ServiceConnectionId} to the service was dropped.","Exception":"System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake. ---> System.IO.IOException: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond\r\n --- End of inner exception stack trace ---\r\n at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error)\r\n at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)\r\n at System.Net.Security.SslStreamInternal.g__InternalFillBufferAsync|38_0[TReadAdapter](TReadAdapter adap, ValueTask1 task, Int32 min, Int32 initial)\r\n at System.Net.Security.SslStreamInternal.ReadAsyncInternal[TReadAdapter](TReadAdapter adapter, Memory1 buffer)\r\n at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory1 destination)\r\n at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory1 buffer, CancellationToken cancellationToken)\r\n at System.Net.WebSockets.ManagedWebSocket.EnsureBufferContainsAsync(Int32 minimumRequiredBytes, CancellationToken cancellationToken, Boolean throwOnPrematureClosure)\r\n at System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TWebSocketReceiveResultGetter,TWebSocketReceiveResult](Memory1 payloadBuffer, CancellationToken cancellationToken, TWebSocketReceiveResultGetter resultGetter)\r\n at 
System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TWebSocketReceiveResultGetter,TWebSocketReceiveResult](Memory1 payloadBuffer, CancellationToken cancellationToken, TWebSocketReceiveResultGetter resultGetter)\r\n at Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport.StartReceiving(WebSocket socket)\r\n at System.IO.Pipelines.PipeCompletion.ThrowLatchedException()\r\n at System.IO.Pipelines.Pipe.GetReadResult(ReadResult& result)\r\n at System.IO.Pipelines.Pipe.GetReadAsyncResult()\r\n at System.IO.Pipelines.Pipe.DefaultPipeReader.GetResult(Int16 token)\r\n at Microsoft.Azure.SignalR.ServiceConnectionBase.ProcessIncomingAsync()","ServiceConnectionId":"8f916f18-2d11-496a-bdff-a9dad81eb434"}

And then a few dozen instances of this error, presumably one for each connected client:

  timestamp [UTC] 2019-09-20T05:16:29.3464351Z
  problemId Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException at Microsoft.Azure.SignalR.ServiceConnectionBase+d__40.MoveNext
  type Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException
  assembly Microsoft.Azure.SignalR.Common, Version=1.0.7.0, Culture=neutral, PublicKeyToken=adb9793829ddae60
  method Microsoft.Azure.SignalR.ServiceConnectionBase+d__40.MoveNext
  outerType Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException
  outerMessage Error while sending message to the service.
  outerAssembly Microsoft.Azure.SignalR.Common, Version=1.0.7.0, Culture=neutral, PublicKeyToken=adb9793829ddae60
  outerMethod Microsoft.Azure.SignalR.ServiceConnectionBase+d__40.MoveNext
  severityLevel 3
  details [{"parsedStack":[{"assembly":"Microsoft.Azure.SignalR.Common, Version=1.0.7.0, Culture=neutral, PublicKeyToken=adb9793829ddae60","method":"Microsoft.Azure.SignalR.ServiceConnectionBase+d__40.MoveNext","level":0,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":1,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":2,"line":0},{"assembly":"Microsoft.Azure.SignalR, Version=1.0.7.0, Culture=neutral, PublicKeyToken=adb9793829ddae60","method":"Microsoft.Azure.SignalR.ServiceConnection+d__14.MoveNext","level":3,"line":0}],"outerId":"0","message":"Error while sending message to the service.","severityLevel":"Error","type":"Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException","id":"34441091"}]
  itemType exception
  customDimensions {"CategoryName":"Microsoft.Azure.SignalR.ServiceConnection","AspNetCoreEnvironment":"Production","{OriginalFormat}":"Error while sending message to the service.","Exception":"Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException: The connection is not active, data cannot be sent to the service.\r\n at Microsoft.Azure.SignalR.ServiceConnectionBase.WriteAsync(ServiceMessage serviceMessage)\r\n at Microsoft.Azure.SignalR.ServiceConnection.ProcessOutgoingMessagesAsync(ServiceConnectionContext connection)"}
@KKhurin
Contributor

KKhurin commented Oct 5, 2019

Thanks for reporting this @dmitrysamuylovpharo.

You're correct - the SDK on app server side will automatically reconnect dropped connections.
I noticed you're using an older version of Microsoft.Azure.SignalR (1.0.7.0). To eliminate some possible corner cases that have already been fixed, would you be willing to try the latest SDK bits?
If the errors still persist, then the next step would be to look at the service side traces.

@dmitrysamuylovpharo
Author

@KKhurin Sure, I'll try updating to the latest SDKs and monitor for a bit. I'll post back here once that's done, with any news either way.

@dmitrysamuylovpharo
Author

dmitrysamuylovpharo commented Nov 18, 2019

@KKhurin Hi, sorry for the delay. I updated all SDKs, both in the web application hosting the SignalR hub and in the mobile clients, to the latest versions. I still see the problem happening occasionally, with a very similar pattern: several instances of this error each time it happens (I presume one for each client connected at the time):

Message Error while sending message to the service, the connection carrying the traffic is dropped. Error detail: The connection is not active, data cannot be sent to the service.  
Exception type Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException  
Failed method Microsoft.Azure.SignalR.ServiceConnectionBase+<WriteAsync>d__46.MoveNext

Callstack:
Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException:
   at Microsoft.Azure.SignalR.ServiceConnectionBase+<WriteAsync>d__46.MoveNext (Microsoft.Azure.SignalR.Common, Version=1.1.1.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.SignalR.ServiceConnection+<ProcessOutgoingMessagesAsync>d__19.MoveNext (Microsoft.Azure.SignalR, Version=1.1.1.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)

What else can be done to debug this further and figure out why it is happening and not automatically reconnecting? Any suggestions?
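A common mitigation while a reconnect bug is being investigated is to treat `ServiceConnectionNotActiveException` as transient and retry the send with backoff, giving the SDK time to re-establish the service connection. A minimal sketch, in which everything except the exception type (the hub type, method name, and retry policy) is a hypothetical assumption:

```csharp
// Hedged sketch: retrying a hub broadcast when the Azure SignalR service
// connection is briefly inactive. NotificationHub, NotifyClientsAsync, and
// "OperationCompleted" are illustrative names, not from this issue.
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Azure.SignalR.Common;

public class NotificationHub : Hub { } // hypothetical hub type

public class Notifier
{
    private readonly IHubContext<NotificationHub> _hubContext;

    public Notifier(IHubContext<NotificationHub> hubContext) => _hubContext = hubContext;

    public async Task NotifyClientsAsync(object payload)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await _hubContext.Clients.All.SendAsync("OperationCompleted", payload);
                return;
            }
            catch (ServiceConnectionNotActiveException) when (attempt < 4)
            {
                // Give the SDK time to reconnect, then retry with exponential backoff.
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
}
```

This only papers over dropped sends; it does not address the underlying reconnect behavior discussed in the rest of the thread.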

@KKhurin
Contributor

KKhurin commented Nov 22, 2019

Thank you @dmitrysamuylovpharo for trying the latest SDK. It should automatically reconnect after getting this error, so I'll have to dig deeper to see if there is a potential for not reconnecting. I'll get back with the results of the investigation.

One thing to note: in the current design, messages can queue up for a particular service connection (if the app server is sending them faster than the service can receive them), and once the connection is dropped, each of these queued messages will fail with this error. When the number of queued messages is very large, it may take a long time to log an error for each of them, creating the impression that we keep using the same connection (while in fact newly sent messages are already using the new, healthy connection). We are aware of this problem and plan to address it in future releases (along with salvaging those queued messages). In my tests, however, this only happens when the app servers manage to severely overwhelm the service. Could this be your case?

@dmitrysamuylovpharo
Author

@KKhurin No, our message traffic is very low, so it can't possibly be that scenario. We only send a message to all clients in response to an occasional event manually triggered by a user's action, and that happens only a handful of times a day.

@KKhurin
Contributor

KKhurin commented Nov 23, 2019

@dmitrysamuylovpharo thank you for the confirmation. I'll look further into it.

@dmitrysamuylovpharo
Author

@KKhurin Today I got the same type of error scenario, but now it is returning a slightly different error message; not sure if that's of any help:

Message Internal server transient error.  
Exception type Microsoft.Azure.SignalR.Common.ServiceConnectionNotActiveException  
Failed method Microsoft.Azure.SignalR.ServiceConnectionBase+<WriteAsync>d__46.MoveNext  
FormattedMessage Error while sending message to the service, the connection carrying the traffic is dropped. Error detail: Internal server transient error.  

@KKhurin
Contributor

KKhurin commented Dec 3, 2019

@dmitrysamuylovpharo this indicates an error on the service side. We can take a look at the service logs to see what is causing it; if you're interested in getting to the bottom of this, please email the time when it happened and your "ResourceId" to <my_git_alias>@microsoft.com.

More importantly, though, I am finally seeing a bug on the SDK side that can lead the app server to keep using the faulted connection for some period of time after it gets an error from the service. The fix should be on its way shortly.

KKhurin added a commit that referenced this issue Dec 11, 2019
KKhurin added a commit that referenced this issue Dec 12, 2019
This was referenced Dec 13, 2019
JialinXin pushed a commit to JialinXin/azure-signalr that referenced this issue Dec 20, 2019
@vicancy
Member

vicancy commented Aug 24, 2020

I believe this specific issue is already addressed. Please feel free to reopen the issue if you still have concerns about it.

@vicancy vicancy closed this as completed Aug 24, 2020