Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CopyFilesOverSSHV0 unbearably slow #13156

Closed
tavgordon opened this issue Jun 19, 2020 · 15 comments · Fixed by #13187 or #13312
Closed

CopyFilesOverSSHV0 unbearably slow #13156

tavgordon opened this issue Jun 19, 2020 · 15 comments · Fixed by #13187 or #13312
Assignees
Labels
Area: ABTT Akvelon Build Tasks Team area of work bug stale Task: CopyFilesOverSSH

Comments

@tavgordon
Copy link

Type: Bug

Enter Task Name: CopyFilesOverSSHV0

Environment

  • Server - Azure Pipelines

    • If using Azure Pipelines, provide the account name, team project name, build definition name/build number:
      • Account Name: numberjak
      • Team Project: Magpie
      • Release Definition Name: DigitalOcean
      • Release Name: Release-4
  • Agent - Hosted or Private:

    • If using Hosted agent, provide agent queue name: WIN-DF00C5S5J33 (this is the machine name. hope that's what you mean)

Issue Description

My release pipeline copies build artifacts to a remote server on DigitalOcean using the CopyFilesOverSSHV0 task. The files are around 500MB total in size. However the process of copying these files to the remote server consistently takes around 43 minutes, which is obviously ridiculously slow.

Running some benchmarks on the remote server the files are being sent to, I can see the problem is not on DigitalOcean server:

I/O Benchmarks:

root@blahblah:~# dd if=/dev/zero of=benchmark bs=1M count=2048 conv=fdatasync
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 2.83064 s, 759 MB/s
root@blahblah:~# dd if=/dev/zero of=benchmark bs=64K count=32K conv=fdatasync
32768+0 records in
32768+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 3.3467 s, 642 MB/s

Download speed benchmark:

root@blahblah:~# wget --output-document=/dev/null http://speedtest.wdc01.softlayer.com/downloads/test500.zip
--2020-06-19 16:15:09--  http://speedtest.wdc01.softlayer.com/downloads/test500.zip
Resolving speedtest.wdc01.softlayer.com (speedtest.wdc01.softlayer.com)... 158.85.230.20, 2607:f0d0:3006:6c::4
Connecting to speedtest.wdc01.softlayer.com (speedtest.wdc01.softlayer.com)|158.85.230.20|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 524288000 (500M) [application/zip]
Saving to: ‘/dev/null’

/dev/null                                         100%[=============================================================================================================>] 500.00M  46.6MB/s    in 30s     

2020-06-19 16:15:48 (16.7 MB/s) - ‘/dev/null’ saved [524288000/524288000]

Task logs

ReleaseLogs_5.zip

@github-actions github-actions bot added Area: ABTT Akvelon Build Tasks Team area of work bug labels Jun 19, 2020
@sdobrodeev sdobrodeev self-assigned this Jun 26, 2020
@sdobrodeev sdobrodeev linked a pull request Jun 26, 2020 that will close this issue
2 tasks
@sdobrodeev sdobrodeev added the awaiting deployment Related changes are waiting for deployment to be completed label Jul 7, 2020
@alexander-smolyakov
Copy link
Contributor

Reopen this, since the fix is waiting to be deployed. We will let you know once the fix will be published.

@sdobrodeev sdobrodeev removed the awaiting deployment Related changes are waiting for deployment to be completed label Jul 15, 2020
@sdobrodeev sdobrodeev linked a pull request Jul 20, 2020 that will close this issue
2 tasks
@sdobrodeev sdobrodeev added the awaiting deployment Related changes are waiting for deployment to be completed label Aug 4, 2020
@sdobrodeev sdobrodeev removed the awaiting deployment Related changes are waiting for deployment to be completed label Sep 2, 2020
@ahsan-cc
Copy link

Any update on this guys?

@anatolybolshakov
Copy link
Contributor

@ahsan-cc we rolled out dependencies update for CopyFilesOverSSH task - so as we see now performance increased for this task. Do you still face with performance issues?

@ahsan-cc
Copy link

Ah yes. It took me almost 30 mins to copy around 50mb of data to DO server.

@anatolybolshakov
Copy link
Contributor

@ahsan-cc do you use on-prem version of Azure DevOps? Could you please share what task version do you use?

@ahsan-cc
Copy link

Sorry for the late response. Yes I am using the available task on Azure DevOps. Task version and name is as followed.

CopyFilesOverSSH@0

@anatolybolshakov
Copy link
Contributor

@ahsan-cc could you please share your exact task version? You can find it in logs like below (here's example for Bash task - v3.182.0):

image

@ahsan-cc
Copy link

Here it is.

2021-03-11T14:08:34.3544624Z ##[section]Starting: Copy to Valus Staging Server
2021-03-11T14:08:34.3811058Z ==============================================================================
2021-03-11T14:08:34.3811430Z Task         : Copy files over SSH
2021-03-11T14:08:34.3813280Z Description  : Copy files or build artifacts to a remote machine over SSH
2021-03-11T14:08:34.3813603Z Version      : 0.180.0
2021-03-11T14:08:34.3813893Z Author       : Microsoft Corporation
2021-03-11T14:08:34.3814528Z Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/deploy/copy-files-over-ssh

@ahsan-cc
Copy link

Hey, I just wanted to check if we have any update on this?

@anatolybolshakov
Copy link
Contributor

Hi @ahsan-cc I reopen it at the moment to investigate it currently.

@cbrennan45
Copy link

Is there an update on this issue? We are also experiencing this but its intermittent. Run it once and it finishes in seconds. The next run takes 15 mins. When this happens i do notice this process maybe having something to do with it??
00:00:03 /home/localadmin/azagent/externals/node10/bin/node /home/localadmin/azagent/_work/_tasks/CopyFilesOverSSH_67cec91b-0351-4c2f-8465-d74b3d2a2d96/0.180.0/copyfilesoverssh.js

In the log i always see this message during the long runs. See the gap in timestamps as well.

[2021-04-11 11:04:32Z INFO JobServerQueue] Stop aggressive process web console line queue.
[2021-04-11 11:21:47Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.

Any ideas?
thanks!

@anatolybolshakov
Copy link
Contributor

@cbrennan45 does it happen for the same file size? Intermittent nature of issue could point to some network issues between agent and target machine.

@cbrennan45
Copy link

cbrennan45 commented Apr 21, 2021

@anatolybolshakov we copy mostly war files avg 40MB-60MB and it seems to not matter as far as size. I run top and do not see any thing suspect except the node process which seems to be running at the time of the copyoverssh task. Other than that I was thinking something over the network because it our IVT environment it is not happening.

One other suspect thing is this msg when it happens. Between that last msg there could be a 15-20 min delay until the next update in the log and then it finally finishes.

[2021-04-11 11:04:32Z INFO JobServerQueue] Stop aggressive process web console line queue.

Thanks!

I wanted to update on the issue. the manual scp command has the same behavior so we ruled out the module in azure as the problem.

@alexander-smolyakov alexander-smolyakov self-assigned this May 4, 2021
@cbrennan45
Copy link

@anatolybolshakov just FYI our issue was a firewall/load balancer issue causing the slowness!
Thanks for inquiring

@github-actions
Copy link

This issue is stale because it has been open for 180 days with no activity. Remove the stale label or comment on the issue otherwise this will be closed in 5 days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment