chore: Revisit Nightly CI Scheduled Run Frequency #1122

Closed
noahpb opened this issue Dec 13, 2024 · 0 comments · Fixed by #1124
Labels
ci Issues pertaining to CI / Pipelines / Testing

Comments

noahpb commented Dec 13, 2024

Describe what should be investigated or refactored

Our nightly CI currently runs 3 jobs per 3 different k8s distros. These workflows take 30-80 minutes to run and incur costs. Failures to date have stemmed from infrastructure-specific misconfigurations or from problems with the runner or external dependencies (timeouts, rate limits, etc.). Because these consistent failures are unrelated to uds-core, the nightly runs have added little value. @mjnagel and I suggest running the workflows 2-3 times per week and before releases instead.

Links to any relevant code

Nightly CI workflows under .github/workflows.

Additional context

The nightly runs are expensive. We should optimize our CI strategy so that it measures the right things. Relates to #777, #784, and #1121.
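
To make the cost argument concrete, here is a rough sketch of the arithmetic: weekly runner-minutes are approximately jobs × runs per week × minutes per job. The function names are hypothetical, and the job count passed in is an assumption (the issue does not pin down whether there are 3 jobs in total or 3 per distro); only the 30-80 minute range comes from the issue itself.

```shell
#!/usr/bin/env bash
# Rough weekly runner-minute estimate: jobs * runs_per_week * minutes_per_job.
# All three inputs are assumptions supplied by the caller.
weekly_runner_minutes() {
  local jobs="$1" runs_per_week="$2" minutes_per_job="$3"
  echo $(( jobs * runs_per_week * minutes_per_job ))
}

# Print the low/high bound for a schedule, using the 30-80 minute
# per-job range quoted in the issue.
estimate_range() {
  local label="$1" jobs="$2" runs_per_week="$3"
  printf '%s: %d-%d runner-minutes/week\n' "$label" \
    "$(weekly_runner_minutes "$jobs" "$runs_per_week" 30)" \
    "$(weekly_runner_minutes "$jobs" "$runs_per_week" 80)"
}
```

Assuming 3 jobs per run, `estimate_range nightly 3 7` prints `nightly: 630-1680 runner-minutes/week`, while `estimate_range weekly 3 1` prints `weekly: 90-240 runner-minutes/week`. If the real job count is higher, both figures scale by the same factor, so the relative saving is unchanged.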

@noahpb noahpb added the ci Issues pertaining to CI / Pipelines / Testing label Dec 13, 2024
mjnagel added a commit that referenced this issue Dec 17, 2024
## Description

Switches nightly CI to weekly, and adds a `milestoned` trigger so that
this CI also runs pre-release. To date, few of the failures we have seen
in these environments were actually caused by uds-core itself.
Reducing this CI to weekly + before release will still let us catch
issues while optimizing our time/cost.

I also added a modification to the cluster wait script to wait on jobs.
Some of our recent CI has failed due to metallb not being installed
completely yet (it is installed via a rancher `helmchart` CR, which
creates a job). Example failure this should resolve:
https://github.com/defenseunicorns/uds-core/actions/runs/12327456734/job/34409486515
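
As an illustration of the idea of waiting on jobs, here is a minimal sketch using `kubectl wait`. This is not the actual change to the cluster wait script: the function name `wait_for_jobs`, the `KUBECTL` override variable, and the default timeout are all hypothetical, and the namespace to check is left to the caller.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with a stub) when testing;
# it defaults to the kubectl found on PATH.
KUBECTL="${KUBECTL:-kubectl}"

# Wait for every Job in a namespace to reach the Complete condition.
# Returns 0 right away when the namespace has no Jobs, because
# `kubectl wait` exits with an error when nothing matches.
wait_for_jobs() {
  local namespace="$1" timeout="${2:-300s}"
  local jobs
  jobs="$("$KUBECTL" get jobs -n "$namespace" -o name)" || return 1
  if [ -z "$jobs" ]; then
    return 0
  fi
  "$KUBECTL" wait --for=condition=complete job --all \
    -n "$namespace" --timeout="$timeout"
}
```

A call such as `wait_for_jobs kube-system 600s` would block until the jobs in that namespace finish or the timeout expires; `kube-system` is only an assumed example of where a `helmchart`-created job might live.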

## Related Issue

Fixes #1122

## Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Other (security config, docs update, etc)

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md) followed