-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unblock PRs because of protected jobs/contexts #3423
Unblock PRs because of protected jobs/contexts #3423
Conversation
@owais please add some comments to the circleci.yaml file to document what happens when. I have a hard time understanding what runs for PRs what runs for "main", what uses secrets and what doesn't. Generally we need to document the build files better, should treat it like a code in a language that is not very self-documenting :-) |
@tigrannajaryan added some comments to the CI file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to break down CircleCI config.yml into smaller files and separate some of the concerns? I find it very hard to understand how the build works despite some very nice comments you added.
.circleci/config.yml
Outdated
# - github-release-and-issues-api-token | ||
# This context makes GITHUB_TOKEN available to jobs. Jobs can use the token to authenticate with the GitHub API. | ||
# We use this to report failures as issues back to the GitHub project. | ||
# Any member of the OpenTelemetry GitHub organisation can run jobs that require this context loadtest-with-github-reports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is a bit confusing. Did it mean to refer to loadtest-with-github-reports
? Is it an example of a job that uses this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Fixed.
.circleci/config.yml
Outdated
# - dockerhub-token | ||
# This contexts makes DOCKER_HUB_USERNAME and DOCKER_HUB_PASSWORD environment variables available to the jobs. | ||
# This is used to publish docker images to Docker Hub. | ||
# Only project approvers and maintainers can run jobs that depend on this context such as publish-stable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we further limit this to maintainers only or it would make impossible to make releases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll mean only maintainers will be able to release. We can create another group called release-engineers
or something like that and add some maintainer and approver to it, then limit this context to that group.
BTW I need help setting this up as I'm not an admin (pinged you on slack).
@@ -152,6 +175,32 @@ commands: | |||
command: issuegenerator ${TEST_RESULTS} | |||
when: on_fail | |||
|
|||
run_loadtest: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we use _ or - as separator in job names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commands (such as run_loadtest) use _
while as jobs use -
. I tried to follow this existing pattern.
Commands are a set of re-usable step definitions that can be used by jobs.
Does not need to be this PR, but I wonder if this is possible at all. If not then maybe we refactor/rewrite the build config file to make it more understandable (not in this PR). |
Maybe we need a diagram or something that explains the workflows, jobs, steps, etc summarized somewhere so that we don't have to read the entire file to understand what is going on. |
This all makes sense and I agree with @tigrannajaryan on the diagram. I am curious how much time we'll want to invest in this given the eventual migration to GitHub Actions? I know we have some work to do before then, but a couple vendor distributions of the collector (including our F5 distribution), either have or are currently working on GitHub workflow pipelines, so I feel that there won't be too much heavy lifting when we're ready to migrate. |
- checkout | ||
- restore_module_cache | ||
- run: | ||
name: Upgrade golang | ||
command: | | ||
choco upgrade golang --version=1.15 | ||
refreshenv | ||
- run: | ||
name: Unit tests | ||
command: (Get-Childitem -Include go.mod -Recurse) | ForEach-Object { cd (Split-Path $_ -Parent); go test ./...; if ($LastExitCode -gt 0) { exit $LastExitCode } } | ||
- save_module_cache | ||
- github_issue_generator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved these steps to the re-usable run_windows_test
command.
- restore_workspace | ||
- install_fluentbit | ||
- run: | ||
name: Loadtest | ||
command: make e2e-test | ||
- store_artifacts: | ||
path: testbed/tests/results | ||
- store_test_results: | ||
path: testbed/tests/results/junit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved these steps to the re-usable run_loadtest
command.
# - github-release-and-issues-api-token | ||
# This context makes GITHUB_TOKEN available to jobs. Jobs can use the token to authenticate with the GitHub API. | ||
# We use this to report failures as issues back to the GitHub project. | ||
# Any member of the OpenTelemetry GitHub organisation can run jobs that require this context e.g, loadtest-with-github-reports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why member of OT org would need access to this. I thought the loadtest that creates github issue only ran on main branch and not as part of any PRs? I don't think we've ever created GH issues as part of tests failing in a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the loadtest that creates github issue only ran on main branch and not as part of any PRs? I don't think we've ever created GH issues as part of tests failing in a PR.
This is true. It only runs on main but now that it uses Contexts to inject GITHUB_TOKEN into the job, we need to enable the context specifically for this job in the pipeline. As soon as we do that, the job can only be executed for commits pushed by people who are allowed to use the context which by default is all members of the OpenTelemetry Github org. So we now have two identical jobs for loadtest (and windows-test). One uses contexts and runs on main and the other does not use context and runs on PRs. This unblocks PRs from the wider community.
I don't know if I can do any better than CircleCI :)
Release jobs would be the same as main jobs but will run |
This makes sense to me. I think we should merge this to unblock PRs and then work towards moving to Github Actions and resolve any remaining issues there. |
…ring metrics (#3423) Adds the staleness store to the metricsBuilder and transaction to track when metrics disappear between scrapes. Depends on open-telemetry/opentelemetry-collector#3414 Fixes #3413 Fixes #2961
… own (#5062) * Revert "receiver/prometheus: glue and complete staleness marking for disappearing metrics (#3423)" This reverts commit 8b79380. * Revert "receiver/prometheus: add store to track stale metrics (#3414)" This reverts commit cdc1634. * stop dropping staleness markers from prometheus, and fix tests * add staleness end to end test from #3423
… own (#5062) (#5170) * Revert "receiver/prometheus: glue and complete staleness marking for disappearing metrics (#3423)" This reverts commit 8b79380. * Revert "receiver/prometheus: add store to track stale metrics (#3414)" This reverts commit cdc1634. * stop dropping staleness markers from prometheus, and fix tests * add staleness end to end test from #3423
This unblocks some PR jobs from not running right now.
We recently switch to CircleCI project environment variables to CircleCI Org Contexts. This allows us to restrict secrets like tokens to Docker Hub and Github to specific CI jobs and Github teams. Today the contexts can be used by any OpenTelemetry GitHub org member. By used, I mean any Otel member can push a commit and run any of the jobs that depend on the secrets provided by CircleCI contexts such as DockerHub (publish publish-dev) and GitHub (loadtest, windows-test).
While this adds much needed additional protection to secrets used by the project, it also means forked PRs from non-members will not be able to run the jobs that depend on any secrets such as loadtest and windows-test.
This PR re-structures the jobs a bit so that forked PRs don't need access to any secrets at all. This change ensures that only jobs triggered by commits to the main branch and release tags can run jobs that depend on secrets.
Merging this PR will unblock other PRs from non-members such as: #3422
This will l allow PR builds from non-opentelemetry members to pass and allow them to be merged but these builds will still fail once they're merged to main. I'll work with the maintainers to solve this for main branch builds.