fix(ci): increase test timeouts #656 #713
4c96714 to eeb731a
.github/workflows/ci.yml
Outdated
@@ -34,7 +34,7 @@ jobs:
        # experimental: true

    steps:
-     # FIXME: These do not work on mac OS as of 2020-12-09
+     # FIzXME: These do not work on mac OS as of 2020-12-09
nit: typo creep
/facepalm cheers!
LGTM
Bumping up the test timeouts to a full hour because under heavy load the GHA runner seems to be extremely slow, meaning that the fabric tests can take longer than half an hour each, even though they usually take about 5 minutes or less even on the slow GHA runners.

Fixes hyperledger-cacti#656

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
…ledger-cacti#656

Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

This change makes the pullImage(...) method of the Containers utility class retry 6 times by default if the Docker image pull fails. The interval between retries increases exponentially (powers of two), starting from a one-second delay and proceeding to 2^6 seconds for the final retry (if that also fails, an AbortError is thrown by the underlying pRetry library that powers the retry mechanism).

For reference, here is a randomly failed CI test execution where the logs show that DockerHub is randomly inaccessible over the network, which is another thing that makes our tests flaky, hence this commit:

https://github.com/hyperledger/cactus/runs/2178802580?check_suite_focus=true#step:8:2448

In case that link goes dead in the future, here are the actual logs:

not ok 60 - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts # time=25389.665ms
  ---
  env:
    TS_NODE_COMPILER_OPTIONS: '{"jsx":"react"}'
  file: packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  timeout: 1800000
  command: /opt/hostedtoolcache/node/12.13.0/x64/bin/node
  args:
    - -r
    - /home/runner/work/cactus/cactus/node_modules/ts-node/register/index.js
    - --max-old-space-size=4096
    - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  stdio:
    - 0
    - pipe
    - 2
  cwd: /home/runner/work/cactus/cactus
  exitCode: 1
  ...
{
    # NodeJS API server + Rust plugin work together
    [2021-03-23T20:45:51.458Z] INFO (VaultTestServer): Created VaultTestServer OK. Image FQN: vault:1.6.1
    not ok 1 Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
      ---
      operator: error
      at: bound (/home/runner/work/cactus/cactus/node_modules/onetime/index.js:30:12)
      stack: |-
        Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
            at /home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:301:17
            at IncomingMessage.<anonymous> (/home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:328:9)
            at IncomingMessage.emit (events.js:215:7)
            at endReadableNT (_stream_readable.js:1183:12)
            at processTicksAndRejections (internal/process/task_queues.js:80:21)
      ...
    Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
}
Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
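The retry behavior described in the commit message above can be illustrated with a minimal, dependency-free sketch. This is not the actual Containers.pullImage(...) implementation, which delegates to pRetry; the names backoffDelaysMs and retryWithBackoff, and the baseMs, factor and sleep parameters, are hypothetical and exist only for illustration. It assumes a plain doubling schedule with no jitter and no maximum delay. Under those assumptions, 6 retries wait 1, 2, 4, 8, 16 and 32 seconds; the exact figures in the real code depend on the pRetry options, which are not shown in this PR.

```typescript
// Hypothetical sketch of retry with exponential backoff.
// NOT the pRetry-based code from this PR; all names here are illustrative.

/** Delay in ms before each retry: baseMs * factor^i for i = 0..retries-1. */
function backoffDelaysMs(retries: number, baseMs = 1000, factor = 2): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(baseMs * Math.pow(factor, i));
  }
  return delays;
}

/**
 * Runs `task` once, then retries it up to `retries` more times,
 * sleeping for an exponentially growing delay between attempts.
 * Rethrows the last error if every attempt fails.
 */
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  retries = 6,
  baseMs = 1000,
  // Injectable so that callers (and tests) can skip the real waiting.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (ex) {
      lastError = ex;
      if (attempt < retries) {
        // Doubling schedule, assuming a factor of 2.
        await sleep(baseMs * Math.pow(2, attempt));
      }
    }
  }
  throw lastError;
}
```

A flaky image pull would be wrapped as retryWithBackoff(() => pullOnce(imageFqn)), where pullOnce is a stand-in for whatever single-attempt pull function is available. Making sleep a parameter keeps the retry logic testable without waiting about a minute for all delays to elapse.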
Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

An attempt to fix the mysterious error in the CI that can be seen at the bottom. Based on the advice of a fellow internet user, as seen here: https://stackoverflow.com/a/61789467

No idea whether this will fix the particular error that we are trying to fix, but we have to try. The underlying issue seems to be a bug in npm itself, but knowing that doesn't remove the need to find a workaround, so here we go...

Error logs and link:
----------------------------
Link: https://github.com/hyperledger/cactus/runs/2179881505?check_suite_focus=true#step:5:8

Logs:
Run npm ci
npm ci
shell: /usr/bin/bash -e {0}
env:
  JAVA_HOME_8.0.275_x64: /opt/hostedtoolcache/jdk/8.0.275/x64
  JAVA_HOME: /opt/hostedtoolcache/jdk/8.0.275/x64
  JAVA_HOME_8_0_275_X64: /opt/hostedtoolcache/jdk/8.0.275/x64
npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error at:
npm ERR! <https://npm.community>

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

An attempt to fix the mysterious issue with npm ci.

Based on a true story: https://stackoverflow.com/a/15483897

CI failure logs: https://github.com/hyperledger/cactus/runs/2179881505?check_suite_focus=true#step:5:8

Logs
------
npm ci
shell: /usr/bin/bash -e {0}
env:
  JAVA_HOME_8.0.275_x64: /opt/hostedtoolcache/jdk/8.0.275/x64
  JAVA_HOME: /opt/hostedtoolcache/jdk/8.0.275/x64
  JAVA_HOME_8_0_275_X64: /opt/hostedtoolcache/jdk/8.0.275/x64
npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error at:
npm ERR! <https://npm.community>

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
…ger-cacti#656

This is yet another attempt at potentially fixing all the remaining CI flakes that only happen on the GitHub Action runners but never on developer machines.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
Fixes #656
Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>