Frequent bad error MAC failures on ubuntu1604-arm64 #842
Comments
I think we should hand out badges for finding new errors. This one's a doozy, so well done @Trott. This machine is in their Sunnyvale datacenter, which I think is their original one, and we've had a bunch of problems in there. I'm going to shut this machine down and reprovision elsewhere.
All done.
Unfortunately, that doesn't look like it fixed it. It's still happening.
https://ci.nodejs.org/job/node-test-commit-arm/11659/nodes=ubuntu1604-arm64/console
gyp FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.105.54/147.75.105.54:48564' is disconnected.
at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
at hudson.model.Run.execute(Run.java:1728)
at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:405)
https://ci.nodejs.org/job/node-test-commit-arm/11654/nodes=ubuntu1604-arm64/console
FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:56790' is disconnected.
at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
at hudson.model.Run.execute(Run.java:1728)
at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:405)
OK, that's pretty strange: it's an entirely new host in a different DC, so it must be the software stack. What I've done on this host (only) is install the Oracle JDK 8 and uninstall the OpenJDK 8. Fingers crossed, I guess.
Here's the most recent one. Not sure if this is on a machine we hope is fixed or a machine that we will apply the fix to?
https://ci.nodejs.org/job/node-test-commit-arm/11730/nodes=ubuntu1604-arm64/console
FATAL: command execution failed
javax.crypto.BadPaddingException: bad record MAC
at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:238)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:974)
Caused: javax.net.ssl.SSLException: bad record MAC
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:981)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:347)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 147.75.111.186/147.75.111.186:50334' is disconnected.
at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy91.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at org.jenkinsci.plugins.conditionalbuildstep.BuilderChain.perform(BuilderChain.java:71)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
at org.jenkinsci.plugins.conditionalbuildstep.ConditionalBuilder.perform(ConditionalBuilder.java:134)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
at hudson.model.Run.execute(Run.java:1728)
at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:405)
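One way to answer the "which machine was this?" question is to pull the agent address out of the `Backing channel` line of each trace. The following is a minimal sketch, not part of the CI tooling; the function name `agent_addresses` is hypothetical, and the pattern assumes the message format seen in the traces above.

```python
import re

# Matches the agent address in a Jenkins remoting disconnect message such as:
# "Backing channel 'JNLP4-connect connection from 1.2.3.4/1.2.3.4:48564' is disconnected."
# The pattern is an assumption based only on the traces quoted in this thread.
BACKING_CHANNEL = re.compile(
    r"Backing channel 'JNLP4-connect connection from "
    r"(?P<host>[^/']+)/(?P<ip>[\d.]+):(?P<port>\d+)' is disconnected"
)


def agent_addresses(console_text):
    """Return the agent IPs named in 'Backing channel' lines, in order of appearance."""
    return [match.group("ip") for match in BACKING_CHANNEL.finditer(console_text)]
```

Running this over the three console logs above would give 147.75.105.54 for build 11659 and 147.75.111.186 for builds 11654 and 11730.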
@Trott has this cropped up recently?
@refack Not that I've noticed.
Similar issues showed up on ubuntu1604-arm64 again; this time it's in the test phase: https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14299/console
cc @rvagg
Hah, that's pretty weird! AFAIK Java doesn't use OpenSSL, so we're looking at something deeper here. @nodejs/crypto, if you're looking for an interesting challenge, this might be for you.
For this machine, on top of build 14299, these ones have failed with the same error:
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14307/
These ones have failed with different crypto errors:
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14300/
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14284/console
The other Ubuntu 16.04 ARM64 machine in CI is much more green but isn't free of crypto failures. There's this one:
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14194/
And these two that are different again:
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14316/
https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/14164/
I can't find anything like this in the CentOS 7 ARM64 builds, so it appears to be limited to Ubuntu 16.04. I don't know how to interpret that, because it's the same OpenSSL being compiled on both. Compiler difference perhaps? We're on gcc 4.8.5 for CentOS 7 and 5.4.0 for Ubuntu 16.04. Not sure where to take this next, tbh.
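For readers unfamiliar with the error itself: "bad record MAC" means the integrity check on a received TLS record failed. The sketch below is a deliberately simplified illustration using HMAC-SHA256, not the JSSE or OpenSSL implementation; real TLS MACs also cover a sequence number and record header, and AEAD cipher suites work differently. The names `seal`, `open_record`, and `flip_bit` are hypothetical. It shows that a single flipped bit anywhere in the record is enough to trigger the failure, whichever TLS library does the checking.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size of an HMAC-SHA256 tag in bytes


def seal(key, payload):
    """Append an HMAC-SHA256 tag to the payload (simplified record format)."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def open_record(key, record):
    """Verify the tag and return the payload, or raise on a mismatch."""
    payload, tag = record[:-TAG_LENGTH], record[-TAG_LENGTH:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad record MAC")
    return payload


def flip_bit(data, index):
    """Return a copy of data with the lowest bit of byte `index` flipped."""
    corrupted = bytearray(data)
    corrupted[index] ^= 0x01
    return bytes(corrupted)
```

Calling `open_record(key, flip_bit(seal(key, b"hello"), 0))` raises `ValueError`, while the untouched record verifies cleanly.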
Can I get a login on the machine? I do not have an arm64 machine.
Yep @shigeki, root@147.75.74.174 has your GitHub keys in it. That's the machine that throws these errors up most frequently. I was just in there running parallel/test-regress-GH-1531 in a loop and was getting failures roughly once every 200 runs. I even got an SSH authentication failure when I tried to log in to it the first time today. I'm not sure if that's the same problem, but it's certainly fishy.
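A loop like the one described can be sketched as follows. This is an assumption about how such a loop might look, not the command that was actually run on the machine; `count_failures` is a hypothetical helper.

```python
import subprocess
import sys


def count_failures(command, runs):
    """Run `command` `runs` times and return how many runs exited non-zero."""
    failures = 0
    for attempt in range(1, runs + 1):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
            # Report each failing run so intermittent failures can be located.
            print(f"run {attempt}: exit code {result.returncode}", file=sys.stderr)
    return failures
```

For example, `count_failures(["./node", "test/parallel/test-regress-GH-1531.js"], 1000)` would run the test 1000 times; the binary and test paths here are assumptions about the checkout layout. At the reported rate of roughly one failure per 200 runs, a run count well above 200 is needed to see the problem reliably.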
Thanks. I can log in now. I will investigate the issue.
I built openssl-1.0.2n and node and ran their tests. I also made thousands of TLS connections between a node TLS server and client, but no errors were found, so I could not reproduce the errors. As far as I checked, the openssl assembler files in node and in openssl-1.0.2n are the same size, as below.
There may be other causes for this issue.
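The size comparison between the two source trees could be automated along these lines. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the `.S` suffix is a guess at how the generated assembler files are named.

```python
import os


def file_sizes(root, suffix):
    """Map each relative path under `root` ending in `suffix` to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                full_path = os.path.join(dirpath, name)
                sizes[os.path.relpath(full_path, root)] = os.path.getsize(full_path)
    return sizes


def size_mismatches(root_a, root_b, suffix=".S"):
    """Return {relative path: (size_a, size_b)} for files whose sizes differ.

    A size of None means the file is missing from that tree.
    """
    sizes_a = file_sizes(root_a, suffix)
    sizes_b = file_sizes(root_b, suffix)
    return {
        path: (sizes_a.get(path), sizes_b.get(path))
        for path in sorted(set(sizes_a) | set(sizes_b))
        if sizes_a.get(path) != sizes_b.get(path)
    }
```

An empty result means every matching file has the same size in both trees. Equal sizes do not prove equal contents, so comparing a hash of each file would be a stronger check.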
@rvagg can you confirm the IDs of these systems under test that are throwing errors? I will double check the firmware on those systems. The output of
Also @rvagg feel free to spin up an additional machine just to loop
I'll note this post which reports a similar set of issues to what @neemah reported re Heroku.
These machines look to be doing better these days -- going to close out this issue. Please reopen if this requires further action.
https://ci.nodejs.org/job/node-test-commit-arm/11637/nodes=ubuntu1604-arm64/console
https://ci.nodejs.org/job/node-test-commit-arm/11642/nodes=ubuntu1604-arm64/console:
https://ci.nodejs.org/job/node-test-commit-arm/11634/nodes=ubuntu1604-arm64/console