Scrape logs out of containers when integration tests fail #477

tgeoghegan · 2022-09-06T23:53:08Z

When tests run against interop containers fail, we now copy logs out of them so that they can be rolled up by GitHub Actions and attached to failing test runs as an artifact.

Part of #455

tgeoghegan · 2022-09-07T01:57:25Z

This seems to be working! I made a change to force test failures. Here's a failing run: https://github.com/divviup/janus/actions/runs/3004476197

We get a link to download the logs artifact. Its contents:

./janus_daphne
./janus_daphne/leader-d2d45a824ddd18673c402f61ed2d71fe44c04eed627d9a084720dfd58374d7bc
./janus_daphne/leader-d2d45a824ddd18673c402f61ed2d71fe44c04eed627d9a084720dfd58374d7bc/postgres_stdout.log
./janus_daphne/leader-d2d45a824ddd18673c402f61ed2d71fe44c04eed627d9a084720dfd58374d7bc/postgres_stderr.log
./janus_daphne/leader-d2d45a824ddd18673c402f61ed2d71fe44c04eed627d9a084720dfd58374d7bc/aggregator_stdout.log
./janus_daphne/leader-d2d45a824ddd18673c402f61ed2d71fe44c04eed627d9a084720dfd58374d7bc/aggregator_stderr.log
./janus_daphne/helper-9104509d90cc3fedea7dcbefb98c2ecbf348073163f15d24458673fe581d25f7
./janus_daphne/helper-9104509d90cc3fedea7dcbefb98c2ecbf348073163f15d24458673fe581d25f7/stdout.log
./janus_daphne/helper-9104509d90cc3fedea7dcbefb98c2ecbf348073163f15d24458673fe581d25f7/stderr.log
./daphne_janus
./daphne_janus/leader-ce359b3f0b57462955a524afe420d06ccaef55fe98f6cbf840b6a42b42dbf285
./daphne_janus/leader-ce359b3f0b57462955a524afe420d06ccaef55fe98f6cbf840b6a42b42dbf285/stdout.log
./daphne_janus/leader-ce359b3f0b57462955a524afe420d06ccaef55fe98f6cbf840b6a42b42dbf285/stderr.log
./daphne_janus/helper-e3dd356a2babf57492b8139670885a4201bca6ce030f2b8625a7dd432d82bef7
./daphne_janus/helper-e3dd356a2babf57492b8139670885a4201bca6ce030f2b8625a7dd432d82bef7/postgres_stdout.log
./daphne_janus/helper-e3dd356a2babf57492b8139670885a4201bca6ce030f2b8625a7dd432d82bef7/postgres_stderr.log
./daphne_janus/helper-e3dd356a2babf57492b8139670885a4201bca6ce030f2b8625a7dd432d82bef7/aggregator_stdout.log
./daphne_janus/helper-e3dd356a2babf57492b8139670885a4201bca6ce030f2b8625a7dd432d82bef7/aggregator_stderr.log

The top level path is the name of the failing test (or rather, a label chosen by the test and passed into the log fetcher function). Then we create a directory named after the role being played and the Docker container ID, inside of which are the stdout and stderrs of the container's processes.

Make the function return `Result<(), anyhow::Error>` so that instead of panicking and unwinding the whole test, we can do some work -- such as fetching container logs -- before all the `testcontainer` objects get dropped and any useful information is lost.

Adds plumbing for extracting logs from Janus and Daphne containers on test failures. We introduce trait `CopyLogs` and implement it on both. For Janus, we recursively copy out the entire `/logs` directory from a container. For Daphne, we run `docker logs` against the container. Extracted logs are written to a directory on the host. If the `JANUS_E2E_LOGS_PATH` environment variable is set, its value is used (this will be used by CI so the log files can be uploaded as GitHub Actions artifacts). If the env var is unset, a tempdir is created and its path is logged.

Logs emitted by integration tests should get attached to actions runs for analysis now.

divergentdave · 2022-09-07T18:26:32Z

integration_tests/tests/common/mod.rs

+    leader: &Leader,
+    mut test: Test,
+) {
+    let result = test().await;


We should probably use std::panic::catch_unwind() here, both to ensure we emit logs on any panics we miss, and to make writing tests in this context more ergonomic. (We can use std::panic::panic_any() to propagate resulting outer errors)

I tried doing that, but std::panic::catch_unwind can't be passed an async closure. The async equivalent is futures::future::FutureExt::catch_unwind, but it requires UnwindSafe, which the closure we want to test is decidedly not, and so I backed off from this approach and instead made submit_measurements_and_verify_aggregate fallible.

Ah right, of course. There are future wrappers that apply this to poll, FWIW: https://docs.rs/futures/latest/futures/future/struct.CatchUnwind.html

tgeoghegan force-pushed the timg/scrape-container-logs-integration-tests branch from 74ba1cf to ceb5690 Compare September 7, 2022 01:30

tgeoghegan force-pushed the timg/scrape-container-logs-integration-tests branch from ceb5690 to 891ae0a Compare September 7, 2022 16:15

tgeoghegan force-pushed the timg/scrape-container-logs-integration-tests branch from 891ae0a to 83c56e3 Compare September 7, 2022 16:16

tgeoghegan changed the title ~~Timg/scrape container logs integration tests~~ Scrape logs out of containers when integration tests fail Sep 7, 2022

tgeoghegan marked this pull request as ready for review September 7, 2022 16:18

tgeoghegan requested a review from a team as a code owner September 7, 2022 16:18

tgeoghegan added 2 commits September 7, 2022 09:37

Upload logs on integration test failures

94cee06

Logs emitted by integration tests should get attached to actions runs for analysis now.

tgeoghegan force-pushed the timg/scrape-container-logs-integration-tests branch from 83c56e3 to 94cee06 Compare September 7, 2022 16:38

divergentdave approved these changes Sep 7, 2022

View reviewed changes

tgeoghegan merged commit f838e11 into main Sep 7, 2022

tgeoghegan deleted the timg/scrape-container-logs-integration-tests branch October 12, 2022 19:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scrape logs out of containers when integration tests fail #477

Scrape logs out of containers when integration tests fail #477

tgeoghegan commented Sep 6, 2022 •

edited

Loading

tgeoghegan commented Sep 7, 2022

divergentdave Sep 7, 2022

tgeoghegan Sep 7, 2022

divergentdave Sep 7, 2022

Scrape logs out of containers when integration tests fail #477

Scrape logs out of containers when integration tests fail #477

Conversation

tgeoghegan commented Sep 6, 2022 • edited Loading

tgeoghegan commented Sep 7, 2022

divergentdave Sep 7, 2022

Choose a reason for hiding this comment

tgeoghegan Sep 7, 2022

Choose a reason for hiding this comment

divergentdave Sep 7, 2022

Choose a reason for hiding this comment

tgeoghegan commented Sep 6, 2022 •

edited

Loading