Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync infrastructure assets from upstream "templates" #205

Merged
merged 17 commits into from
Jul 20, 2023
Merged

Conversation

per1234
Copy link
Contributor

@per1234 per1234 commented Jul 20, 2023

Arduino tooling projects use a standardized infrastructure. A centralized collection of reusable infrastructure assets is maintained in a dedicated repository:

https://github.com/arduino/tooling-project-assets

Since the time this project's infrastructure was installed, some advancements have been made in the upstream "template" assets.

One such advancement was the update of the source URL for the taskfile JSON schema. It was necessary to pull that change into this repository to fix the spurious failure of the "Check Taskfiles" workflow runs:

https://github.com/arduino/libraries-repository-engine/actions/runs/5584881310/jobs/10206876396#step:5:11

While I was at it, I did a comprehensive sync of the project's infrastructure, bringing it up to date with the state of the art upstream assets. See the individual commit messages for explanations of the reasons for the various changes this introduced.

The workflow runs triggered by this PR will provide a basic validation of most of the changes. Unlike the rest of the workflows, the "Release" workflow is not triggered by the submission of a PR, so I did a demonstration release in my fork using the workflow with the changes that are proposed here:

github.com and wikipedia.org have different localization behaviors depending on the URL.

If a language code is specified via the URL, then that language version of the page is loaded, regardless of the
language setting of the user's browser or GitHub. For example, this URL will take the user to the English version of the
page even if their browser is configured for Chinese:

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-on-github

If no language code is specified via the URL, then it redirects to the version of the page localized for the user's
language preference, where available. For example, if the user has selected "Chinese" as their preferred language in
their browser settings, then this URL:

https://docs.github.com/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-on-github

redirects to:

https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-on-github
`GITHUB_TOKEN` is an access token that is automatically generated and made accessible for use in GitHub Actions workflow
runs. The global default permissions of this token for workflow runs in a trusted context (i.e., not triggered by a
`pull_request` event from a fork) are set in the GiHub enterprise/organization/repository's administrative settings,
giving it either read-only or write permissions in all scopes.

In the case of a read-only default configuration, any workflow operations that require write permissions would fail with
an error like:

> 403: Resource not accessible by integration

In the case of a write default configuration, workflows have unnecessary permissions, which violates the security
principle of least privilege.

For this reason, GitHub Actions now allows fine grained control at a per-workflow or per-workflow job scope of the
permissions provided to the token. This is done using the `permissions` workflow key, which is used here to configure
the workflows for only the permissions require by each individual job.

I chose to always configure permissions at the job level even though in some cases the same permissions configuration
could be used for all jobs in a workflow. Even if functionally equivalent, I think it is semantically more appropriate
to always set the permissions at the job scope since the intention is to make the most granular possible permissions
configuration. Hopefully this approach will increase the likelihood that appropriate permissions configurations will be
made in any additional jobs that are added to the workflows in the future.

The automatic permissions downgrade from write to read for workflow runs in an untrusted context (e.g., triggered by a
`pull_request` event from a fork) is unaffected by this change.

Even when all permissions are withheld (`permissions: {}`), the token still provides the authenticated API request rate
limiting allowance (authenticating API requests to avoid rate limiting is a one of the uses of the token in these
workflows).

Read permissions are required in the "contents" scope in order to checkout private repositories. Even though those
permissions are not required when the workflows are installed in public repositories, the templates are intended to be
applicable in public and private repositories both and so a small excess in permissions was chosen instead of the
alternative of having to maintain separate variants of each workflow for use in public or private repos.
The trunk-based development strategy is used by some tooling projects (e.g., Arduino CLI). Their release branches may
contain a subset of the history of the default branch.

The status of the GitHub Actions workflows should be evaluated before making a release. However, this is not so simple as
checking the status of the commit at the tip of the release branch. The reason is that, for the sake of efficiency, the
workflows are configured to run only when the processes are relevant to the trigger event (e.g., no need to run unit
tests for a change to the readme).

In the case of the default branch, you can simply set the workflow runs filter to that branch and then check the result
of the latest run of each workflow of interest. However, that was not possible to do with the release branch since it
might be that the workflow was never run in that branch. The status of the latest run of the workflow in the default
branch might not match the status for the release branch if the release branch does not contain the full history.

For this reason, it will be helpful to trigger all relevant workflows on the creation of a release branch. This will
ensure that each of those workflows will always have at least one run in the release branch. Subsequent commits pushed to
the branch can run based on their usual trigger filters and the status of the latest run of each workflow in the branch
will provide an accurate indication of the state of that branch.

Branches are created for purposes other than releases, most notably feature branches to stage work for a pull request.
Because the collection of workflows in a Tooling project are often very comprehensive, it would not be convenient or
efficient to fully run them on the creation of every feature branch.

Unfortunately, GitHub Actions does not support filters on the `create` event of branch creation like it does for the
`push` and `pull_request` events. There is support for a `branches` filter of the `push` event, but that filter is an AND
to the `paths` filter and this application requires an OR. For this reason, the workflows must be triggered by the
creation of any branch. The unwanted job runs are prevented by adding a `run-determination` job with the branch filter
handled by Bash commands. The other jobs of the workflow use this `run-determination` job as a dependency, only running
when it indicates they should via a job output. Because this minimal `run-determination` job runs very quickly, it is
roughly equivalent to the workflow having been skipped entirely for non-release branch creations.

Release branches are not used in this project at this time so the changes are not relevant here. However, the
infrastructure maintenance is easiest if the number of modifications to the "template" workflows is kept to the minimum,
so the changes are pulled from the upstream files even when not needed.
GitHub Actions provides the capability for workflow authors to use the capabilities of the GitHub Actions ToolKit
package directly in the `run` keys of workflows via "workflow commands". One such command is `set-output`, which allows
data to be passed out of a workflow step as an output.

It has been determined that this command has potential to be a security risk in some applications. For this reason,
GitHub has deprecated the command and a warning of this is shown in the workflow run summary page of any workflow using
it:

The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more
information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/

The identical capability is now provided in a safer form via the GitHub Actions "environment files" system. Migrating
the use of the deprecated workflow commands to use the `GITHUB_OUTPUT` environment file instead fixes any potential
vulnerabilities in the workflows, resolves the warnings, and avoids the eventual complete breakage of the workflows that
would result from GitHub's planned removal of the `set-output` workflow command.
…sed installation

The "Licensed" tool is used to check the project's compatibility with licensing of its dependencies.

This tool is installed by the "Check Go Dependencies" and "Check npm Dependencies" GitHub Actions workflows using the
`jonabc/setup-licensed` GitHub Actions action. This action attempts the installation according to the following
procedure:

1. Install the Ruby gem.
2. If gem installation fails, install the release asset from the `github/licensed` repo.

Spurious failures of the runs of these workflows are occurring due to hitting the rate limit during the attempt to
install the release asset via the GitHub API in step (2).

The error message shown in the workflow run logs when this failure occurs:

> Error: API rate limit exceeded for 104.45.203.178. (But here's the good news: Authenticated requests get a higher rate
> limit. Check out the documentation for more details.)

suggests the rate limiting could be avoided by providing an authentication token for the GitHub API request. However,
the workflow already does this, and it is used by the action, but intentionally not for this specific API request.

The problem would be avoided entirely if the gem installation at step (1) was successful. It was failing with the
following error shown in the workflow run logs:

> ERROR:  While executing gem ... (Gem::FilePermissionError)
>     You don't have write permissions for the /var/lib/gems/3.0.0 directory.
> gem installation was not successful

This failure can be avoided by setting up an accessible installation of Ruby in the runner machine, which is
accomplished using the `ruby/setup-ruby` action in a step preceding the `jonabc/setup-licensed` step in the workflow.
In order to catch breakage caused by external changes, the project's GitHub Actions workflows have a schedule event
trigger.

Previously, all the triggers were configured to trigger at the same time. It seems that this might cause spurious
workflow run failures due to rate limiting or anti-DDoS systems. In order to avoid that, the triggers are configured to
occur at various times.
It is often necessary to escape the newline for continuation when breaking shell commands into multiple lines for
readability. However, if a line break occurs in a shell command in an unterminated command, the continuation is implicit.

The escaping in these shell commands was unnecessary and only made them a bit more cryptic, since the brackets already
clearly indicate the structure
The GitHub Actions workflows are typically configured to run whenever any relevant file in the repository is changed.
However, the results of the workflow run are also dependent on the external environment it runs in, which include:

- The software running on the GitHub hosted GitHub Actions runner machines
- The GitHub Actions actions used by the workflow
- The dependencies that are installed by the workflow directly or via the GitHub Actions actions it uses

The workflows often do not fully pin to a specific version of given external tool. This was done in the interest on
reducing the maintenance burden of keeping the systems up to date. However, it also means that a new release can cause
the workflow runs to start failing (which might happen through an enhancement to that resource resolving a false
negative, or a defect causing a false negative).

When the repository file path trigger is used by itself, this sort of external breakage is only revealed when an
unrelated change triggers the workflow. That can be distracting even to a dedicated member of the project development
team, as well as confusing and discouraging to any contributor.

This type of change can be caught by adding a `schedule` event trigger that causes the workflow to run periodically in
addition to the other on-demand triggers. This allows the problem to be identified and resolved at the maintainer's
convenience, separate from the unrelated development work.
The "Check Taskfiles" template workflow validates the project's taskfiles against the official JSON schema.

The contents of that schema was recently moved from the previous location in the Schema Store to the Task project
repository:

go-task/task#910

An attempt was made to mitigate the impact of that move by replacing the content in the original schema with an external
reference to the new one. However, the schema validator used by the workflow does not automatically follow external
references, which caused the workflow to fail:

schema /home/runner/work/_temp/taskfile-schema/taskfile.json is invalid
error: can't resolve reference https://taskfile.dev/schema.json from id #

Although this could be resolved by also downloading the referenced schema, since the original schema intentionally does
not contain anything of value, the better fix is to simply use the real schema directly in the workflow.
In order to provide coverage for projects using release branches, the "Publish Tester Build" workflow has a `create`
event trigger. This trigger is intended to cause the workflow to run on the creation of a release branch.

The same `create` event also occurs when a tag is pushed to the repository. There is no need to generate a tester build
for tags because these are only made after the project is in a fully tested state and the tag push will trigger a
release build that makes a generated tester build superfluous anyway. For this reason, and because the triggering of the
tester build on this event can cause problems in some project, the workflow is configured to skip the generation of the
tester build when it was triggered by a tag push.
…on paths

For the sake of efficiency, the "Test Go" GitHub Actions workflow is configured to run only when relevant files are
modified. Since the workflow uploads code coverage data to Codecov, the Codecov configuration file is one of these files.

The standard filename for the Codecov configuration file is codecov.yml, and the workflow's path filter was configured
for that filename. It turns out an alternative filename is also recognized: .codecov.yml. Two subfolders are also
supported in addition to the root of the repository as locations for the configuration file.

The workflow's paths filter was not configured for the alternative filename and locations, meaning the workflow would
not be triggered on change to the Codecov configuration in projects that use the alternative configuration file name or
locations.

The workflow's paths filter is hereby configured to recognize changes to any valid Codecov configuration file.
Some of the project's development tool dependencies are sourced from the npm software registry.

Previously, the version of the tools used was not controlled. This was problematic because:

- A different version of the tool may be used on the contributor's machine than on the CI runner, resulting in confusing
  failures.
- The project is immediately subject to disruption or breakage resulting from a release of the tool.

---

These tools were installed via either of the following methods:

`npx <pkg>`

This approach has the following behaviors of interest:

https://docs.npmjs.com/cli/v8/commands/npx#description

> If any requested packages are not present in the local project dependencies, then they are installed to a folder in
> the npm cache, which is added to the PATH environment variable in the executed process.

> Package names provided without a specifier will be matched with whatever version exists in the local project. Package
> names with a specifier will only be considered a match if they have the exact same name and version as the local
> dependency.

This means that the version used was:

1. Whatever happens to be present in the local cache
2. The latest available version if it is not already present

`npm install --global <pkg>`

The latest available version of the package is used.

---
`
The new approach is to specify the version of the tools via the standard npm metadata files (package.json +
package-lock.json). This approach was chosen over the `npx <pkg>@<version>` alternative for the following reasons:

- Enables automated updates via Dependabot PRs
- Enables automated vulnerability alerts
- Separates dependency management from the asset contents (i.e., no need to mess with the taskfile or workflow on every
  update)
- Matches how we are already managing Python dependencies (pyproject.toml + poetry.lock)
…com links

The link check job of the "Check Markdown" workflow had frequent intermittent spurious failures recently, caused by
links under the docs.github.com domain returning 403 HTTP status.

Others experiencing the same problem reported that they were able to work around the problem by providing a custom
`Accept-Encoding` HTTP request header. Although I was not able to find any explanation of why, it did appear to resolve
the problem.

This is a change that was made in the upstream "template" this file is based on. I don't know whether or not the
workaround is still necessary (I haven't experienced any spurious link check failures from this domain recently, but
that might be due to the workaround having been implemented in many of the repositories), but it does no harm so it is
best to sync with the upstream "template" file regardless.
120 columns is the recommended line length for YAML code in Arduino tooling projects. The yamllint tool used by the
"Check YAML" template produces a warning when a line exceeds this length.

This is not a hard limit and in some cases it is either impossible or not beneficial to make lines less than 120 in
length so some violations of the guideline are unavoidable. However, a survey of the YAML files in the repository
revealed some opportunities for improving the code by reducing the lengths.
This folder contains only generated files, so should not be processed by Prettier.
The `go:build` task has an `LDFLAGS` taskfile template variable, which is currently used by some projects to inject the
versioning information while building via an `-ldflags` flag.

`go test` has undocumented support for an `-ldflags` flag as well. Projects might find it useful to be able to set the
value of string variables in the code while running the tests. To provide support by the template task for this use case,
a `TEST_LDFLAGS` taskfile template variable is added to the `go test` command in the `go:test` task, as well as empty
definition of the variable. This can be left empty if the project doesn't have any need for the capability.
For some types of dependency management frameworks, it is necessary to run some operation before the check for
unapproved dependency license types.

No preparation is needed for Go module-based projects like this one, so the preparation task is left empty, but this is
a copy of a "template" asset that is intended to be a "template" that is generally applicable to projects of any type so
the change is brought solely in order to keep the file in sync with the upstream source.
@per1234 per1234 added type: enhancement Proposed improvement topic: infrastructure Related to project infrastructure type: imperfection Perceived defect in any part of project labels Jul 20, 2023
@per1234 per1234 self-assigned this Jul 20, 2023
@codecov-commenter
Copy link

Codecov Report

Patch and project coverage have no change.

Comparison is base (d8e967b) 42.54% compared to head (7814917) 42.54%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #205   +/-   ##
=======================================
  Coverage   42.54%   42.54%           
=======================================
  Files          26       26           
  Lines        1455     1455           
=======================================
  Hits          619      619           
  Misses        775      775           
  Partials       61       61           
Flag Coverage Δ
unit 42.54% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@per1234 per1234 merged commit 6c95286 into main Jul 20, 2023
121 checks passed
@per1234 per1234 deleted the sync-infrastructure branch July 20, 2023 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: infrastructure Related to project infrastructure type: enhancement Proposed improvement type: imperfection Perceived defect in any part of project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants