If any test in CI gets stuck, Buildkite terminates the CI run after 1 hour. After that, the only way to understand what happened is to:
1. Download stderr.
2. Find the last line of the form "test <test_name> ... test <test_name> has been running for over 60 seconds".
3. <test_name> is the test that timed out.
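The manual lookup above could be scripted. Below is a minimal sketch, assuming the slow-test message has exactly the wording quoted in the steps; the function name `last_slow_test` and the sample log are hypothetical, invented for illustration only.

```rust
/// Hypothetical helper: returns the name of the last test that the log
/// reports as "running for over 60 seconds", or `None` if there is none.
/// Assumes the message wording quoted in this issue.
fn last_slow_test(log: &str) -> Option<&str> {
    const SUFFIX: &str = " has been running for over 60 seconds";
    // Scan from the end so the most recent report wins.
    log.lines().rev().find_map(|line| {
        let end = line.find(SUFFIX)?;
        let head = &line[..end];
        // The name follows the last "test " that precedes the suffix.
        let start = head.rfind("test ")? + "test ".len();
        Some(head[start..].trim())
    })
}

fn main() {
    // Made-up sample log, not taken from a real CI run.
    let log = "test sync::a ... ok\n\
               test sync::b ... test sync::b has been running for over 60 seconds\n\
               test net::c ... test net::c has been running for over 60 seconds\n";
    match last_slow_test(log) {
        Some(name) => println!("probably stuck: {}", name),
        None => println!("no slow-test report found"),
    }
}
```

This only automates the grep. It does not tell a stuck test apart from one that was merely slow and finished later.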
This is a very inconvenient way to get basic information about failed tests. I suggest making tests handle their own timeouts, as the catching_up.rs tests do, OR (preferably) terminating them automatically after 60 seconds. I also propose a recommendation "if the test may work >60 seconds, put it to the Nightly, not CI". Since we can already tell that a test has been running for more than 60 seconds, terminating it should not be a big deal. :)
Ideally, I'd love to see a friendly interface that shows which tests failed, without grepping logs for specific substrings. Finding failed tests is a routine operation that has to be done many times per day, on each unsuccessful CI run.
I have a negative result here: in Rust, only cooperative cancellation is possible. That is, each test should enforce its own timeout (but of course we can abstract this in some kind of library function).
So I'd treat a hanging test as a bug in the test itself, and would fix the test to fail with a timeout error, before fixing the actual bug being exposed.
We can, of course, have some kind of a wrapper process which kills the whole cargo test if it doesn't observe passed/failed tests within a certain timeout, but my gut feeling is that enforcing timeouts from the inside, while being more work, will help with ironing out subtle concurrency bugs.
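As a rough illustration of the "library function" idea, here is a minimal sketch using only the standard library. The name `run_with_timeout` and its signature are hypothetical and not taken from this repository. Note that it does not cancel anything: the test body runs on a worker thread, and on timeout the calling test panics while the worker is left running until the test process exits. A truly cooperative version would also have the body check a deadline or a stop flag.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `body` on a worker thread and panics with a
/// descriptive message if no result arrives within `limit`.
/// The worker is NOT killed on timeout; it is simply abandoned.
fn run_with_timeout<T, F>(name: &str, limit: Duration, body: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        // The receiver may be gone if we already timed out; ignore that.
        let _ = tx.send(body());
    });
    match rx.recv_timeout(limit) {
        Ok(value) => {
            let _ = worker.join();
            value
        }
        Err(mpsc::RecvTimeoutError::Timeout) => {
            panic!("test `{}` timed out after {:?}", name, limit)
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            // The sender was dropped without sending: the body panicked.
            // Re-raise the original panic so the real failure is reported.
            if let Err(payload) = worker.join() {
                std::panic::resume_unwind(payload);
            }
            panic!("test `{}` exited without producing a result", name)
        }
    }
}

fn main() {
    // A body that finishes in time returns its value as usual.
    let sum = run_with_timeout("fast", Duration::from_secs(5), || 2 + 2);
    println!("fast body returned {}", sum);
}
```

Inside a `#[test]` function, the body would be wrapped as `run_with_timeout("my_test", Duration::from_secs(60), || { /* test body */ })`, so a hang shows up as an ordinary failure that names the test.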
Separately, +1 for "if the test may work >60 seconds, put it to the Nightly, not CI". There are rust-lang/rust#75752 and rust-lang/rust#64663 (comment) which can help with finding out existing slow tests.