-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scrape logs out of containers when integration tests fail #477
Conversation
74ba1cf
to
ceb5690
Compare
This seems to be working! I made a change to force test failures. Here's a failing run: https://github.com/divviup/janus/actions/runs/3004476197 We get a link to download the logs artifact. Its contents:
The top level path is the name of the failing test (or rather, a label chosen by the test and passed into the log fetcher function). Then we create a directory named after the role being played and the Docker container ID, inside of which are the stdout and stderrs of the container's processes. |
ceb5690
to
891ae0a
Compare
Make the function return `Result<(), anyhow::Error>` so that instead of panicking and unwinding the whole test, we can do some work -- such as fetching container logs -- before all the `testcontainer` objects get dropped and any useful information is lost.
891ae0a
to
83c56e3
Compare
Adds plumbing for extracting logs from Janus and Daphne containers on test failures. We introduce trait `CopyLogs` and implement it on both. For Janus, we recursively copy out the entire `/logs` directory from a container. For Daphne, we run `docker logs` against the container. Extracted logs are written to a directory on the host. If the `JANUS_E2E_LOGS_PATH` environment variable is set, its value is used (this will be used by CI so the log files can be uploaded as GitHub Actions artifacts). If the env var is unset, a tempdir is created and its path is logged.
Logs emitted by integration tests should get attached to actions runs for analysis now.
83c56e3
to
94cee06
Compare
leader: &Leader, | ||
mut test: Test, | ||
) { | ||
let result = test().await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably use std::panic::catch_unwind()
here, both to ensure we emit logs on any panics we miss, and to make writing tests in this context more ergonomic. (We can use std::panic::panic_any()
to propagate resulting outer errors)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried doing that, but std::panic::catch_unwind
can't be passed an async closure. The async equivalent is futures::future::FutureExt::catch_unwind
, but it requires UnwindSafe
, which the closure we want to test is decidedly not, and so I backed off from this approach and instead made submit_measurements_and_verify_aggregate
fallible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, of course. There are future wrappers that apply this to poll, FWIW: https://docs.rs/futures/latest/futures/future/struct.CatchUnwind.html
When tests run against interop containers fail, we now copy logs out of them so that they can be rolled up by GitHub Actions and attached to failing test runs as an artifact.
Part of #455