Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sshclient deadlock/freeze on disconnect #355

Closed
gary-holland opened this issue Dec 13, 2017 · 126 comments
Closed

Sshclient deadlock/freeze on disconnect #355

gary-holland opened this issue Dec 13, 2017 · 126 comments
Assignees

Comments

@gary-holland
Copy link

gary-holland commented Dec 13, 2017

Hi,

My code is locking when attempting to call the Disconnect function.

This snippet is an example of what is locking:

SshClient client = new SshClient("server", "user", "pass");
client.Connect();
client.Disconnect(); # freezes here

I'm connecting to a ssh connection running on a QNAP NAS server, with the server version listed as "SSH-2.0-OpenSSH_7.6".

This is on OSX (High Sierra). I just tried in Windows, and the issue doesn't exist there.

Any help would be much appreciated.

Thanks.

@keyant
Copy link

keyant commented Dec 18, 2017

I also encountered the same problem ( on OSX High Sierra, too).

I found the code is locked by this line: _messageListenerCompleted.WaitOne(); in the method Disconnect() of Session.cs.

@oktinaut
Copy link

oktinaut commented Jan 3, 2018

The problem occurs on Sierra as well (using .NET Core 2.0).

@HalfLegend
Copy link

HalfLegend commented Jan 11, 2018

Same issue, also OSX High Sierra
Not only sshClient, scpClient also have the issue.

Any update?

@HalfLegend
Copy link

HalfLegend commented Jan 11, 2018

Now I am using the following code to avoid the issue:

                Task.Factory.StartNew(() => {
                    sshClient.Dispose();
                });

and kill these threads later.

Though the threads are blocked, the socket resources are in fact freed successfully before _messageListenerCompleted.WaitOne() is called.

@JulianRooze
Copy link

JulianRooze commented Jan 18, 2018

We run into the same issue, but on Linux. Can confirm that it's waiting on _messageListenerCompleted.WaitOne() inside Disconnect.

We have a wrapper class around the SshClient and we worked around the issue by disconnecting the socket through reflection if necessary after a 2 second delay and setting the waithandle. The dispose method of our wrapper class:

public void Dispose()
{
  if (_client == null) return;

  Task.Run(() =>
  {
	_log.Debug("Disposing _client");

	var timer = new System.Timers.Timer();

	timer.Interval = 2000;
	timer.AutoReset = false;

	timer.Elapsed += (s, e) =>
	{
	  try
	  {
		var sessionField = _client.GetType().GetProperty("Session", BindingFlags.NonPublic | BindingFlags.Instance);

		if (sessionField != null)
		{
		  var session = sessionField.GetValue(_client);

		  if (session != null)
		  {
			var socketField = session.GetType().GetField("_socket", BindingFlags.NonPublic | BindingFlags.Instance);

			if (socketField != null)
			{
			  var socket = (Socket)socketField.GetValue(session);

			  if (socket != null)
			  {
				_log.Debug($"Socket state: Connected = {socket.Connected}, Blocking = {socket.Blocking}, Available = {socket.Available}, LocalEndPoint = {socket.LocalEndPoint}, RemoteEndPoint = {socket.RemoteEndPoint}");

				_log.Debug("Set _socket to null");

				try
				{
				  socket.Dispose();
				}
				catch (Exception ex)
				{
				  _log.Debug("Exception disposing _socket", ex);
				}

				socketField.SetValue(session, null);
			  }
			  else
			  {
				_log.Debug("_socket was null");
			  }
			}

			var messageListenerCompletedField = session.GetType().GetField("_messageListenerCompleted", BindingFlags.NonPublic | BindingFlags.Instance);

			var messageListenerCompleted = (EventWaitHandle)messageListenerCompletedField.GetValue(session);

			if (messageListenerCompleted != null)
			{
			  var waitHandleSet = messageListenerCompleted.WaitOne(0);

			  _log.Debug($"_messageListenerCompleted was set = {waitHandleSet}");

			  if (!waitHandleSet)
			  {
				_log.Debug($"Calling Set()");
				messageListenerCompleted.Set();
			  }
			}
			else
			{
			  _log.Debug("_messageListenerCompleted was null");
			}
		  }
		  else
		  {
			_log.Debug("Session was null");
		  }
		}
	  }
	  catch (Exception ex)
	  {
		_log.Debug($"Exception in Timer event handler", ex);
	  }
	};

	timer.Start();

	_client.Dispose();

	_log.Info("Disposed _client");
  });
}

A typical log when it fails to disconnect:

image

But it usually disconnects just fine, not disconnecting is the exception. The ratios are 20 failures for 700 successful disconnects in about 12 hours.

@JulianRooze
Copy link

I should add that it happens much more often that _socket is already null when disconnecting hangs, and only _messageListenerCompleted.Set() wasn't called yet. So the log usually says this when the disconnect fails:

image

So the _socket still being connected is the exception.

@bloudraak
Copy link

I have the same issue using .NET Core 2.0 on macOS Sierra 10.12.6 (16G1212)

using (var client = new SshClient(host, username, password))
                {

                    client.Connect();
                    var command = client.CreateCommand("df -H");
                    var result = command.Execute();
                    Console.WriteLine(result);
                    client.Disconnect();
                }

@neseih
Copy link

neseih commented Jun 5, 2018

Just ran into the same issue on Ubuntu 16.04, dotnet core 2.1

@jolosimeon
Copy link

jolosimeon commented Jun 25, 2018

Ran into the same problem on macOS High Sierra, 10.13.5, dotnet core version 2.1.300
It fails to disconnect and also hangs at the end of "using (var client = new SftpClient(...))"

@drieseng
Copy link
Member

What version of SSH.NET are you using?
Can you easily reproduce this issue, or does it only happen on some occasion?

@JulianRooze
Copy link

@drieseng We use the latest (2016.1.0). From our logs, in about 4% of cases it hasn't set the waithandle (see our workaround code above) so it would wait indefinitely. Ever since upgrading to .NET Core 2.1 we haven't seen the cases where the socket was still connected, so I think that's solved by that upgrade, but it was really rare before so it might be a coincidence. I think the occurrence of the issue is more or less random and probably a timing thing.

@HalfLegend
Copy link

@drieseng Any update?
It could be easily reproduced on mac os. Just open a ssh client and close it

@cravall
Copy link

cravall commented Jul 19, 2018

@drieseng - Getting it on macOS High Sierra dotnet core v2.1.301. Using the latest nuget package 2016.1.0. Easily reproducible... just open a connection and try to call Disconnect.

@drieseng
Copy link
Member

I don't have a mac :(
Can you reproduce it on Linux?

@cravall
Copy link

cravall commented Jul 19, 2018

Just tried to reproduce it in docker and couldn't... which I guess is a good thing because that's what we're using outside of our dev environments.

Still, would be good to have it fixed for dev purposes. I'll see if I can have a crack at fixing but honestly, I wouldn't know where to start 😄

@cravall
Copy link

cravall commented Jul 19, 2018

So when I slowly step through the code it doesn't hang... I changed all the bunch of DiagnosticAbstraction.Log to simple Console.WriteLine and got the following.

This is when it does hang:
screen shot 2018-07-20 at 10 02 25 am

This is when it does NOT hang:
screen shot 2018-07-20 at 10 00 42 am

@cravall
Copy link

cravall commented Jul 19, 2018

And this is from the docker container (which doesn't hang):
screen shot 2018-07-20 at 10 47 07 am

@Spica92
Copy link

Spica92 commented Jul 23, 2018

Hi,
I need some help :
I can connect correctly but any command sent, hang indefinitely. I tried access via putty and it works fine.
I tried to set a sleep between connection and command but result is the same.
If anyone has a clue it would be nice.
Renci.Ssh.net version 1.0.1.0
Thanks
KeyboardInteractiveAuthenticationMethod kauth = new KeyboardInteractiveAuthenticationMethod("integrator"); kauth.AuthenticationPrompt += new EventHandler<Renci.SshNet.Common.AuthenticationPromptEventArgs>(HandleKeyEvent); PasswordAuthenticationMethod pauth = new PasswordAuthenticationMethod("integrator", "myPassword"); ConnectionInfo connectionInfo = new ConnectionInfo("10.2.112.00", 22, "integrator", pauth, kauth); SshClient sshClient = new SshClient(connectionInfo); SshCommand cmd = sshClient.CreateCommand($"xFeedback register /Status/Audio"); string s =cmd.Execute(); // <-- hangs here for ever sshClient.Disconnect();

it fails here in SshCommand.cs
private void WaitOnHandle(WaitHandle waitHandle)
{
var waitHandles = new[]
{
_sessionErrorOccuredWaitHandle,
waitHandle
};

        switch (WaitHandle.WaitAny(waitHandles, CommandTimeout))  **// WaitHandle nerver raised**
        {
            case 0:
                throw _exception;
            case WaitHandle.WaitTimeout:
                throw new SshOperationTimeoutException(string.Format(CultureInfo.CurrentCulture, "Command '{0}' has timed out.", CommandText));
        }
    }

@drieseng
Copy link
Member

@Spica92 Please create a separate issue for this problem. Include as much details as possible (OS, SSH server, exact version of SSH.NET, …).

@drieseng
Copy link
Member

drieseng commented Jul 23, 2018

@cvallance From your traces, it looks as if .NET Core doesn't break out of the Socket.Select(IList checkRead, IList checkWrite, IList checkError, int microSeconds) call in Session.MessageListener() when the socket is disposed. I'll try to have a closer look tomorrow.

You can always add some CWLs before and after the Socket.Select(IList checkRead, IList checkWrite, IList checkError, int microSeconds) call, and in the finally block.

@drieseng drieseng self-assigned this Jul 25, 2018
@drieseng
Copy link
Member

I've been able to reproduce this issue on Linux.
It appears to be a bug (regression?) in .NET Core 2.1.
Filed as https://github.com/dotnet/corefx/issues/31368

@Calvfre
Copy link

Calvfre commented Aug 10, 2018

Thanks for identifying the bug.
Until it is fixed, is there a suggested / preferred work around we should use?

@JulianRooze
Copy link

@Calvfre the workaround code I posted above has been working well for us for a few months now. Nasty reflection, but it works.

@mc-denisov
Copy link

.NET say use Shutdown() to prevent endless hang: dotnet/corefx#26898

@Calvfre
Copy link

Calvfre commented Aug 13, 2018

@mc-denisov If I am looking at this correctly,
In the session class, shutdown is being used.
"_socket.Shutdown(SocketShutdown.Send);"

@drieseng
Copy link
Member

@jpmdl Thanks for the feedback!

@xiaohui-hiwintech
Copy link

@drieseng I have the same issue when disconnecting the sftpclient, I'm so happy to see this. so when this package will upload to NuGet?

@vlflorian
Copy link

Just encountered the same problem. Fix in 2020.0.0-beta1 works. Any idea when we can expect a stable 2020 release?

@Visavant
Copy link

@drieseng I have the same issue when disconnecting the sftpclient, I'm so happy to see this. so when this package will upload to NuGet?

Checkbox "Include prerelease" in NuGet's Browse within the package manager.

@TheTrigger
Copy link

Still buggy for me, handling ~1500 requests (sync too) this library goes in a deadlock with any version. I had to remove it and run ssh Linux command... :(

@darkoperator
Copy link

darkoperator commented Jan 30, 2021 via email

@drieseng
Copy link
Member

drieseng commented Feb 1, 2021

If someone can provide the means for me to reproduce this issue, don't hesitate to reopen this issue (or create a new one).

@TheTrigger
Copy link

Even with last week update

Sent from my iPhone
On Jan 29, 2021, at 11:19 AM, ƒabio @.***> wrote:  Still buggy for me, handling ~1500 requests (sync too) this library goes in a deadlock with any version. I had to remove it and run ssh Linux command... :( — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

I'll try in a few days, thank you

@TheTrigger
Copy link

unfortunately, it still doesn't work, even with 2020.0.1

@darkoperator
Copy link

darkoperator commented Feb 4, 2021 via email

@TheTrigger
Copy link

Sure, addresses size could be ~1000, sometimes it locks even with a few elements

foreach (var address in addresses)
{
	if (cancellationToken.IsCancellationRequested)
		break;

	var task = BackupAsync(address, cancellationToken);
	tasks.Add(task);
}

await Task.WhenAll(tasks);

// code never reached here ...
public class RenciWrapper : ISshWrapper
{
	private const int DefaultPort = 22;

	private readonly SshClient _client;

	public RenciWrapper(string address, string user, string password)
	{
		var connectionInfo = new ConnectionInfo(
						host: address,
						port: DefaultPort,
						username: user,
						new AuthenticationMethod[]
						{
							new PasswordAuthenticationMethod(user, password),
						}
					)
		{
			Timeout = new TimeSpan(0, 0, 6),
			RetryAttempts = 1,
		};

		_client = new SshClient(connectionInfo);
	}

	public Task<string> ExecuteCommandAsync(CancellationToken cancellationToken = default)
	{
		try
		{
			_client.Connect(); // not async :(

			var command = _client.CreateCommand("export");
			var response = command.Execute(); // I've also used https://github.com/JohnTheGr8/Renci.SshNet.Async

			return Task.FromResult(response);
		}
		finally
		{
			_client?.Dispose();
		}
	}

	public void Dispose()
	{
		_client?.Dispose();
	}
}

No problems with my async SshCliWrapper

@AraHaan
Copy link

AraHaan commented Feb 4, 2021

Hm, @uffebjorklund
I try use your SSH.NET.Fork package and this issue not solved, connection stuck on moment dispose\disconnect client.

using (var client = new SshClient(Connection(dhcpServer, "root")))
{
    client.Connect();
    var command = client.CreateCommand("df -H").Execute();

    // 1. Disconnect only
    client.Disconnect();

    // 2. Dispose with Disconnect
    client.Dispose();
    client.Disconnect();

    // 3. Dispose method by @JulianRooze
    Dispose(client);
    client.Disconnect();
}

No one of these ways does not work.
May be i try not correctly, but i run 5 tasks, each by 30 sec, and my server is hung after 6 hours due to lack of memory.. This is very critical for work.

Does anyone have a working solution or another library?

I think you should use disconnect BEFORE dispose.

@TheTrigger
Copy link

TheTrigger commented Feb 4, 2021

Still no luck

try
{
	_client.Connect();

	var command = _client.CreateCommand("export");
	var response = command.Execute();

	return Task.FromResult(response);
}
finally
{
	_client.Disconnect();
	_client?.Dispose();
}

I have some unreachable devices. unsuccessful connections could cause locks?

@Dimencia
Copy link

Dimencia commented Aug 12, 2021

Can confirm on Linux that 2020.0.0-beta1 works to avoid this issue, while 2020.0.1 has this problem

This is also likely related to a problem that, on Linux, _client.IsConnected always returns true (after initial connection). Even sending a restart command and confirming the client disconnects, it still returns true. I was using it to confirm a client disconnected after attempting to restart it, and had to write separate logic to attempt repeated connections until one fails

EDIT: Nope, beta doesn't work reliably either, it's just intermittent on both of them so it seemed like it was working properly.

@kumudumushi-dev
Copy link

kumudumushi-dev commented Oct 18, 2021

Hi @drieseng I'm getting issue also on my SFTP setup using the SSH.NET latest version using .NET 5.0. It works fine until I got this issue.

Socket read operation has timed out after 30000 milliseconds.

This is my current code.

using (SftpClient sftpClient = new SftpClient(await getSftpConnection(export.Schedule.Host,
                                                  export.Schedule.Username,
                                                  export.Schedule.Password,
                                                  export.Schedule.KeyFile,
                                                  export.Schedule.Parapharase,
                                                  export.Schedule.LogonType, log)))

        {
            try
            {
                if (!sftpClient.IsConnected)
                {
                    log.LogInformation($"Connecting Sftp....");
                    do
                    {
                        try
                        {
                            sftpClient.Connect();
                        }
                        catch (Renci.SshNet.Common.SshConnectionException e)
                        {
                            log.LogInformation($"Connection error: {e.Message}");
                        }
                    } while (!sftpClient.IsConnected);

                    log.LogInformation($"Done connecting Sftp");
                }
                        //Send file per devicename
                        foreach (var _perDevice in export.ExportTablePerDevice)
                        {
                            log.LogInformation($"Device {_perDevice.DeviceName}");
                            try
                            {
                                var csv = export.Schedule.Layout.Equals("LAYOUT 05") ? ReportHelper.ConvertToDat(export.ExportTable) :
                                    ReportHelper.ConvertToCsv(_perDevice.ExportTable, export.Schedule.Layout, _perDevice.TopUserHeader);

                                var stream = ZipArchiveExtensions.GenerateStreamFromString(csv);

                                string filename = "";
                                if (export.IsExportDevices)
                                {
                                    filename = _perDevice.DeviceName.Replace(@"/", "_") + "_" + DateTime.Now.Ticks + "." + export.Schedule.ExportFormat.ToLower() ?? "csv";
                                }
                                else
                                {
                                        filename = (!String.IsNullOrEmpty(export.Schedule.FileName) ? export.Schedule.FileName.Replace(@"/", "_").Replace("#tick", DateTime.Now.Ticks.ToString()) :
                                           _perDevice.DeviceName.Replace(@"/", "_") + DateTime.Now.Ticks) + "." + export.Schedule.ExportFormat.ToLower() ?? "csv";
                                }
                                var file = $"/{export.Schedule.FilePath}/{filename}";
                                await sftpClient.UploadAsync(stream, file);
                            }
                            catch (Exception ex)
                            {
                                sftpClient.Disconnect();
                                sftpClient.Dispose();
                                log.LogInformation($"Device error: {ex.Message}");
                            }
                        }
                sftpClient.Disconnect();
                response = new StatusCodeResponse()
                {
                    Code = HttpStatusCode.OK.ToString(),
                    HttpStatus = (int)HttpStatusCode.OK,
                    Message = $"Successfully sent export {export.Schedule.Name} via sftp server"
                };

            }
            catch (Exception ex)
            {
                sftpClient.Disconnect();
                log.LogInformation($"An exception has been caught {ex.Message}");
                response = new StatusCodeResponse()
                {
                    HttpStatus = (int)HttpStatusCode.InternalServerError,
                    Code = HttpStatusCode.InternalServerError.ToString(),
                    Message = $"Error sending export {export.Schedule.Name} via sftp server: {ex.Message}"
                };
            }
        }

@aberbegall
Copy link

Is this issue fixed with 2020.0.1 and .NET 5.0 or shall I simply avoid calling Disconnect() method and just Dispose() the connection?
Asking because I don't see that Shutdown(SocketShutdown.Both) was finally implemented into any latest SSH.NET release.
Thanks

@ilqvya
Copy link

ilqvya commented Jun 13, 2022

Better use the Process.Start to call the native to windows "ssh" program and pass args to it. The project is a mess of multithreading issues

@aberbegall
Copy link

aberbegall commented Jun 13, 2022

@effolkronium thanks for the tip, but my code needs to support Win Server 2012 and 2016 too where no windows native ssh exists (I would need to install PuTTY Plink or similar 3rd party)

@oguzhantopcu
Copy link

oguzhantopcu commented Jun 13, 2022 via email

@aberbegall
Copy link

@oguzhantopcu Is this the update your fork contains or is there anything else to consider?
Thanks

@oguzhantopcu
Copy link

Yeah, that one line code fixes that hang problem. My app makes thousands of sftp connections everyday without any problem.

@erik-wramner
Copy link

Then perhaps we could get that line into the official version?

@oguzhantopcu
Copy link

is there anyone that still faces this issue? i want to ditch my custom code and use the original one. maybe it is fixed in .net core side.

@drieseng btw why we are not using SocketShutdown.Both instead of SocketShutdown.Send?

as like here OctopusDeploy@7e8bf58

@Kalyxt
Copy link

Kalyxt commented Feb 1, 2024

Yes its still issue for me, If I connect to remote client, and I lost connection, it hangs on Disconnect.

@oguzhantopcu
Copy link

Yes its still issue for me, If I connect to remote client, and I lost connection, it hangs on Disconnect.

.net core version?

@Kalyxt
Copy link

Kalyxt commented Feb 1, 2024

I am trying in on .NET Framework 4.7.2 project.

I can easily test it in this order:

  • Create SSH connection to remote machine
  • Sudo reboot on remote machine
  • It hangs in Disconnect calling from Dispose at _ = _messageListenerCompleted.WaitOne();
public void Disconnect()
{
    DiagnosticAbstraction.Log(string.Format("[{0}] Disconnecting session.", ToHex(SessionId)));

    // send SSH_MSG_DISCONNECT message, clear socket read buffer and dispose it
    Disconnect(DisconnectReason.ByApplication, "Connection terminated by the client.");

    // at this point, we are sure that the listener thread will stop as we've
    // disconnected the socket, so lets wait until the message listener thread
    // has completed
    if (_messageListenerCompleted != null)
    {
        _ = _messageListenerCompleted.WaitOne();
    }
}

I fixed it by adding _ = _messageListenerCompleted.WaitOne(1000);
and _ = _messageListenerCompleted?.Set(); in MessageListener.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests