ci: move a chunk of the Rust CI over to GHA #9290

nagisa · 2023-07-12T12:53:59Z

GHA has many significant benefits compared to Buildkite. The biggest one is that blindly using GHA does not, by default, allow untrusted contributors to run arbitrary code – instead GitHub presents reviewers with a button they can click after reviewing the PR.

It also makes maintenance of the CI infrastructure somewhat easier, which, given our soon-to-be-stretched-very-thin infrastructure team, is a huge benefit.

In the process of implementing this PR I ended up simplifying a lot of the Rust testing as well. In particular, instead of running half a dozen different combinations on just Linux, we now run just the nightly vs non-nightly versions. This saves CPU time on many different rebuilds… Of note, one feature that no longer gets tested is `mock_node`, as it was causing a failure in one of the integration tests. If we wanted to re-enable this particular test, we should figure out how to fix the test, rather than adding a new test configuration.

As a final benefit, I’ve also added an M1 macOS-based job. This should help make sure that people who develop on company-issued laptops can actually be productive, rather than having to tip-toe around a boatload of failing tests. We can decide at any point in the future, in the repository configuration, whether this job should block PRs from landing.

See also #9608

nagisa · 2023-07-12T12:55:23Z

I guess the check for allowing changes to pipeline.yml works a little too well :D

test-utils/style/src/lib.rs

…t` (#9608)

I got absolutely fed up with waiting for prerequisite infrastructure work for #9290 (why does it have to be _that_ hard?) However the PR in question had some other important improvements that do not necessarily *rely* on changes to said infrastructure to work. In particular:

* `nextest` now retries failing tests a few times before giving up, to make sure they aren't spuriously failing;
* This should help some timing out integration tests in particular, as those often fail because of some deadlockish situation in my experience.
* style checks still run with `cargo nextest` – unfortunately that means that in CI this will run these checks multiple times, but that doesn’t sound particularly terrible of a tradeoff (especially if the other changes mean we won't be retrying entire test suites as often anymore;)
* This should allow for a greater portion of the test suite to run on Macs – unfortunately not verified by the CI, but people do complain and this should make the situation better.

cc #9367

nagisa · 2023-10-02T10:43:44Z

.config/nextest.toml

@@ -1,8 +1,5 @@
 [profile.default]
 slow-timeout = { period = "60s", terminate-after = 2, grace-period = "0s" }
-# FIXME(nagisa): use --profile ci in CI instead when we manage to modify CI scripts...
-retries = { backoff = "fixed", count = 3, delay = "1s" }
-failure-output = "final"


This change may be contentious. On one hand, keeping this in makes flaky tests less of a major PiTA during local development. On the other hand, we might end up living with the flaky tests in perpetuity if we keep this, as nobody would have any motivation to actually look into these flakies…

+1 to removing retries
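
For reference, the alternative hinted at by the removed FIXME (a separate profile that only CI opts into) could look roughly like the sketch below. This is an illustration, not necessarily what this PR ends up doing: the `ci` profile name comes from the FIXME comment, and the retry settings are simply the ones deleted from the default profile above.

```toml
# .config/nextest.toml (sketch, assuming a dedicated CI profile is wanted)

[profile.default]
# Unchanged from the diff above: local runs get no retries, so flaky
# tests stay visible during development.
slow-timeout = { period = "60s", terminate-after = 2, grace-period = "0s" }

# Hypothetical CI-only profile. Custom nextest profiles inherit any
# setting they do not override from [profile.default].
[profile.ci]
retries = { backoff = "fixed", count = 3, delay = "1s" }
failure-output = "final"
```

CI would then opt in with `cargo nextest run --profile ci`, while a plain `cargo nextest run` keeps using the default profile. The tradeoff is the same one raised above: retries in CI can hide flaky tests, just in fewer places.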

frol · 2023-10-12T14:45:40Z

.buildkite/pipeline.yml

-      cargo install cargo-deny
-      cargo run -p themis --release
-      if [ -e deny.toml ]; then
-        cargo-deny --all-features check bans
-      fi


@nagisa Was it intentional to decommission cargo-deny and themis sanity checks?

They’re still being run as part of cargo nextest here as implemented in #9608.

nagisa mentioned this pull request Jul 20, 2023

testing permissions to allow buildkite pipeline changes by codeowners #9022

Closed

nagisa mentioned this pull request Aug 11, 2023

[DX] There isn’t a single local command to ensure all CI will pass #9415

Closed

nagisa mentioned this pull request Sep 11, 2023

🔷 [2023-09] throughout which it will hopefully become more pleasant to wake up on workdays #9510

Closed

nagisa force-pushed the nagisa/cargo-tests-style branch 4 times, most recently from 409800f to 06a269c on September 11, 2023 11:48

nagisa commented Sep 11, 2023

test-utils/style/src/lib.rs (outdated)

nagisa force-pushed the nagisa/cargo-tests-style branch 9 times, most recently from 315191a to 4b56709 on September 21, 2023 08:04

nagisa force-pushed the nagisa/cargo-tests-style branch 3 times, most recently from e17417d to b2f700e on September 27, 2023 08:35

nagisa mentioned this pull request Sep 28, 2023

ci: improve test resilience and run style checks within cargo nextest #9608

Merged

nagisa force-pushed the nagisa/cargo-tests-style branch 2 times, most recently from 48da3c7 to 43fe737 on October 2, 2023 10:38

nagisa commented Oct 2, 2023

nagisa force-pushed the nagisa/cargo-tests-style branch 3 times, most recently from c55bb2c to 8670d05 on October 2, 2023 16:46

debug test_arena_mmap

ae5eb0c

nagisa force-pushed the nagisa/cargo-tests-style branch from b595d7d to ae5eb0c on October 5, 2023 10:57

Deny warnings in clippy check…

3a74f0c

This is much better than denying warnings during build as even with warnings present we can see the rest of the test suite failures at the same time.

nagisa force-pushed the nagisa/cargo-tests-style branch from 1a17509 to 3a74f0c on October 5, 2023 11:35

nagisa changed the title from ~~ci: move Rust style checks to rust tests~~ to "ci: move a chunk of the Rust CI over to GHA" on Oct 5, 2023

nagisa added this pull request to the merge queue Oct 7, 2023

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 7, 2023

nagisa added this pull request to the merge queue Oct 7, 2023

Run the workflow in merge queue as well

aefb56b

nagisa removed this pull request from the merge queue due to a manual request Oct 7, 2023

nagisa enabled auto-merge October 7, 2023 09:06

nagisa added this pull request to the merge queue Oct 7, 2023

Merged via the queue into master with commit 7dc48f7 Oct 7, 2023

nagisa deleted the nagisa/cargo-tests-style branch October 7, 2023 10:00

frol reviewed Oct 12, 2023

nagisa commented Jul 12, 2023

nagisa Oct 2, 2023

wacban Oct 4, 2023

frol Oct 12, 2023

nagisa Oct 12, 2023
