Almost all nodes are offline with "This node is offline because it uses an old slave.jar" #1759

Closed
richardlau opened this issue Apr 12, 2019 · 15 comments


@richardlau
Member

From IRC:

[20190412 02:10:10] <refack> We updated Jenkins. So it probably got picky
@rvagg
Member

rvagg commented Apr 12, 2019

Yeah, so the Jenkins security update brought us onto a newer version. Their LTS channel is a bit funky in that a simple update makes us jump versions, with no opt-in. The update made all our slave.jar files out of date, so I've manually been through all of them and updated slave.jar.

I need @nodejs/platform-aix @nodejs/platform-s390 @nodejs/platform-ppc folks to do these ones though, they're above my pay grade:

test-iinthecloud-ibmi72-ppc64_be-1
test-iinthecloud-ibmi72-ppc64_be-2
test-linuxonecc-rhel72-s390x-3
test-marist-zos13-s390x-1
test-marist-zos13-s390x-2
test-osuosl-aix61-ppc64_be-1
test-osuosl-aix61-ppc64_be-2
test-osuosl-aix61-ppc64_be-3
test-osuosl-ubuntu1404-ppc64_le-4

I've disabled https://ci.nodejs.org/job/node-test-commit-aix/ so that can be re-enabled when the AIX ones are online.

@sam-github
Contributor

@rvagg I have access to those machines and would like to help, but I need a hint! I have been learning a bit about Ansible lately, but don't know anything about Jenkins.

From where do I get the slave.jar? Can I just re-run ansible on them to update, or is this a manual process?

@sam-github
Contributor

https://github.com/nodejs/build/blob/master/ansible/playbooks/jenkins/worker/upgrade-jar.yml and https://github.com/nodejs/build/blob/master/ansible/group_vars/infra.yml#L1 make it look like Ansible will do this for me, so we just need to rerun Ansible on those hosts?

@mhdawson & @rvagg ^---- shall I do this? Though I only have access to the test machines, not release.

@refack
Contributor

refack commented Apr 12, 2019

Some hints at https://github.com/nodejs/build/blob/master/doc/jenkins-guide.md

tl;dr:

  1. On the machines stop the Jenkins worker daemon
  2. cd /home/iojs && curl -L https://ci.nodejs.org/jnlpJars/slave.jar -o slave.jar
  3. chown iojs.iojs slave.jar (or the platform equivalent)
  4. Restart the Jenkins worker daemon
  5. Fun & Profit
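
The steps above can be sketched as a small dry-run helper that only builds the command list and runs nothing. This is a sketch under assumptions, not the project's tooling: the `stop_cmd` and `start_cmd` defaults are hypothetical placeholders, since the thread does not say how the worker daemon is managed and that differs per platform. Only the download URL, the `/home/iojs` directory, and the `iojs` owner come from the steps above. The sketch writes the owner as `owner:group`, which is the colon form of the same `chown` call.

```python
import shlex

JAR_URL = "https://ci.nodejs.org/jnlpJars/slave.jar"  # URL from the steps above


def jar_update_commands(home="/home/iojs", owner="iojs",
                        stop_cmd="service jenkins stop",     # hypothetical, platform-specific
                        start_cmd="service jenkins start"):  # hypothetical, platform-specific
    """Return the shell commands for one worker, in order, without running them."""
    jar = f"{home}/slave.jar"
    return [
        stop_cmd,
        f"curl -L {shlex.quote(JAR_URL)} -o {shlex.quote(jar)}",
        f"chown {owner}:{owner} {shlex.quote(jar)}",
        start_cmd,
    ]


if __name__ == "__main__":
    # Print the plan so it can be reviewed before anything is run by hand.
    for cmd in jar_update_commands():
        print(cmd)
```

Printing the plan first keeps the manual process reviewable on platforms where the daemon commands still need to be filled in.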

@refack
Contributor

refack commented Apr 12, 2019

> Though I only have access to the test machines, not release.

It seems like the Jenkins version on ci-release is not rejecting old worker remoting jars, so it's less of an issue.

@mhdawson
Member

Did somebody already do it for the AIX machines? It looks like they are online.

@mhdawson
Member

I've updated the IBM platforms in the release CI, so if we do a similar upgrade there we should be OK.

@sam-github
Contributor

test-linuxonecc-rhel72-s390x-2
test-linuxonecc-rhel72-s390x-3
test-osuosl-ubuntu1404-ppc64_le-1
test-osuosl-ubuntu1404-ppc64_le-2
test-osuosl-ubuntu1404-ppc64_le-4

are all up-to-date.

The two I updated, test-linuxonecc-rhel72-s390x-3 and test-osuosl-ubuntu1404-ppc64_le-4, are restarted and status says connected.

@mhdawson
Member

@rvagg do the updates get applied automatically? I went looking in both the build repo and the security repo for an issue giving a heads up that an update to Jenkins was going to be applied. We might want to be proactive in updating the agents, or at least have people ready to act if the Jenkins update requires an agent update.
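
One way to be ready to act is to check, right after a Jenkins update, which nodes went offline because of the jar. The following is a minimal sketch, assuming the server exposes its node list at `/computer/api/json` and that each entry carries `displayName`, `offline`, and `offlineCauseReason` fields. Those field names are assumptions to verify against the actual server, and the request may need credentials.

```python
import json
import urllib.request


def offline_for_old_jar(payload):
    """Return names of nodes that are offline with a cause mentioning slave.jar."""
    names = []
    for node in payload.get("computer", []):
        reason = node.get("offlineCauseReason") or ""
        if node.get("offline") and "slave.jar" in reason:
            names.append(node.get("displayName"))
    return names


def fetch_computers(base_url="https://ci.nodejs.org"):
    """Fetch the node list as JSON (assumed endpoint; may require authentication)."""
    with urllib.request.urlopen(f"{base_url}/computer/api/json") as response:
        return json.load(response)


if __name__ == "__main__":
    for name in offline_for_old_jar(fetch_computers()):
        print(name)
```

Keeping the filter separate from the network call means it can be tried against a saved copy of the JSON before pointing it at the live server.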

@miladfarca

@mhdawson @sam-github I've updated the jar file on test-marist-zos13-s390x-1 and test-marist-zos13-s390x-2. I kept the same file permissions as the older jar file.

@sam-github
Contributor

test-osuosl-aix61-ppc64_be-1
test-osuosl-aix61-ppc64_be-2
test-osuosl-aix61-ppc64_be-3

are updated, restarted, and status pages claim they are connected.

@mhdawson
Member

I've re-enabled AIX.

@sam-github
Contributor

Everything in #1759 (comment) has been updated, and status is connected.

@refack
Contributor

refack commented Apr 12, 2019

FTR, this is part of the list from https://ci.nodejs.org/computer/, sorted by the "remoting" version (last column).
Jenkins marks 3.27 as [image]

| Name | % Disk Usage | Architecture | Free Disk Space | JVM Version | Response Time | Clock Difference | Free Swap Space | Free Temp Space | Remoting Version ↓ |
|---|---|---|---|---|---|---|---|---|---|
| test-iinthecloud-ibmi72-ppc64_be-1 | N/A | | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| test-iinthecloud-ibmi72-ppc64_be-2 | N/A | | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| test-requireio-osx1010-x64-1 | N/A | | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| test-requireio_louiscntr-debian9-armv7l_pi2-1 | N/A | | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| infra-softlayer-ubuntu1404-x64-2 | 32.0 % | Linux (amd64) | 311.63 GB | 1.8.0_171 | 1617ms | In sync | 959.10 MB | 311.63 GB | 3.27 |
| test-azure_msft-win2012r2-x64-3 | 31.0 % | Windows Server 2012 R2 (amd64) | 87.22 GB | 1.8.0_201 | 14456ms | 3.5 sec behind | 2.46 GB | 87.22 GB | 3.27 |
| test-azure_msft-win2016-x64-2 | 29.0 % | Windows Server 2016 (amd64) | 89.73 GB | 1.8.0_191 | 1603ms | In sync | 2.61 GB | 89.73 GB | 3.27 |
| test-digitalocean-freebsd11-x64-1 | 14.0 % | FreeBSD (amd64) | 28.89 GB | 1.8.0_192 | 14477ms | 9.8 sec behind | 1.86 GB | 28.89 GB | 3.27 |
| test-digitalocean-ubuntu1404-x64-1 | 46.0 % | Linux (amd64) | 21.02 GB | 1.8.0_201 | 14472ms | 3.2 sec behind | 1.99 GB | 21.02 GB | 3.27 |
| test-digitalocean-ubuntu1804-x64-1 | 75.0 % | Linux (amd64) | 5.92 GB | 1.8.0_191 | 14459ms | 3.3 sec behind | 0 B | 5.92 GB | 3.27 |
| test-joyent-smartos17-x64-1 | 3.0 % | SunOS (amd64) | 97.11 GB | 1.8.0_162-internal | 503ms | In sync | 15.60 GB | 97.11 GB | 3.27 |
| test-joyent-ubuntu1804-x64-1 | 96.0 % | Linux (amd64) | 317.16 MB | 1.8.0_191 | 14437ms | 3 sec behind | 1.89 GB | 317.16 MB | 3.27 |
| test-macstadium-macos10.10-x64-1 | 86.0 % | Mac OS X (x86_64) | 6.64 GB | 10.0.1 | 14404ms | 3 sec behind | 991.00 MB | 6.64 GB | 3.27 |
| test-macstadium-macos10.10-x64-2 | 64.0 % | Mac OS X (x86_64) | 17.78 GB | 9.0.1 | 14433ms | 3.2 sec behind | 1.09 GB | 17.78 GB | 3.27 |
| test-macstadium-macos10.11-x64-1 | 62.0 % | Mac OS X (x86_64) | 18.47 GB | 10.0.1 | 491ms | In sync | 591.00 MB | 18.47 GB | 3.27 |
| test-macstadium-macos10.11-x64-2 | 82.0 % | Mac OS X (x86_64) | 8.91 GB | 10.0.1 | 867ms | In sync | 1.35 GB | 8.91 GB | 3.27 |
| test-macstadium-macos10.12-x64-1 | 68.0 % | Mac OS X (x86_64) | 15.87 GB | 10.0.1 | 1546ms | In sync | 837.00 MB | 15.87 GB | 3.27 |
| test-macstadium-macos10.12-x64-2 | 53.0 % | Mac OS X (x86_64) | 22.98 GB | 10.0.1 | 14427ms | 3.2 sec behind | 749.00 MB | 22.98 GB | 3.27 |
| test-marist-zos13-s390x-1 | 11.0 % | z/OS (s390x) | 18.28 GB | 1.8.0 | 844ms | 3 min 50 sec ahead | N/A | 34.12 MB | 3.27 |
| test-marist-zos13-s390x-2 | 13.0 % | z/OS (s390x) | 17.98 GB | 1.8.0 | 844ms | 3 min 50 sec ahead | N/A | 33.87 MB | 3.27 |
| test-packetnet-ubuntu1604-arm64-2 | 20.0 % | Linux (aarch64) | 181.63 GB | 1.8.0_191 | 14386ms | 3 sec behind | 2.32 GB | 181.63 GB | 3.27 |
| test-packetnet-ubuntu1604-x64-1 | 17.0 % | Linux (amd64) | 815.15 GB | 1.8.0_191 | 333ms | In sync | 1.82 GB | 98.86 GB | 3.27 |
| test-packetnet-ubuntu1604-x64-2 | 14.0 % | Linux (amd64) | 847.83 GB | 1.8.0_191 | 14414ms | 3.2 sec behind | 1.89 GB | 125.34 GB | 3.27 |
| test-rackspace-freebsd10-x64-1 | 10.0 % | FreeBSD (amd64) | 30.71 GB | 1.8.0_181 | 14411ms | 3.2 sec behind | N/A | 30.71 GB | 3.27 |
| test-softlayer-ubuntu1404-x64-1 | 66.0 % | Linux (amd64) | 8.14 GB | 1.8.0_201 | 1348ms | In sync | 1.92 GB | 8.14 GB | 3.27 |
| master | 35.0 % | Linux (amd64) | 204.40 GB | 1.8.0_201 | 0ms | In sync | 0 B | 204.40 GB | 3.29 |
| node-msft-cross-compiler-1 | 96.0 % | Linux (amd64) | 1.03 GB | 1.8.0_191 | 14466ms | 3.2 sec behind | 0 B | 1.03 GB | 3.29 |
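
A listing like the one above can also be checked mechanically. The sketch below, with hypothetical helper names, compares each node's remoting version with the one reported for `master` and returns the nodes that are behind. It only reports the difference. Whether Jenkins actually rejects a given older version is not something the listing states.

```python
def parse_version(text):
    """Turn '3.27' into (3, 27) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def agents_behind(remoting_by_node, reference="master"):
    """Return sorted names of nodes whose remoting version is older than the reference's.

    Nodes with no reported version (None or 'N/A') are skipped because they are offline.
    """
    target = parse_version(remoting_by_node[reference])
    behind = []
    for name, version in remoting_by_node.items():
        if version in (None, "N/A"):
            continue
        if parse_version(version) < target:
            behind.append(name)
    return sorted(behind)


if __name__ == "__main__":
    # A few rows copied from the listing above.
    sample = {
        "master": "3.29",
        "node-msft-cross-compiler-1": "3.29",
        "test-marist-zos13-s390x-1": "3.27",
        "test-softlayer-ubuntu1404-x64-1": "3.27",
        "test-requireio-osx1010-x64-1": "N/A",
    }
    print(agents_behind(sample))
```

Parsing into integer tuples matters because a plain string comparison would rank "3.9" above "3.27".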

@refack refack closed this as completed Apr 12, 2019
@rvagg
Member

rvagg commented Apr 15, 2019

@mhdawson no, I didn't add an issue and didn't think it'd be a big deal, just a regular security release as far as I could tell from the emails. I didn't expect it to be disruptive! But it turns out that we need to pay closer attention to version numbers, because there is only "LTS" and that jumps around a bit (I'm not sure they are using that term in the same way that most other projects do).
Sorry!


6 participants