Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use git command instead of GitHub API to list modified files #10106

Merged
merged 5 commits into from
Apr 20, 2022

Conversation

frouioui
Copy link
Member

@frouioui frouioui commented Apr 19, 2022

Description

This Pull Request aims to fix the API Rate error we were observing with the new dorny/paths-filter integration. We recently implemented a feature that skips workflows based on the list of modified files, using dorny/paths-filter, however, when several commits were pushed at the same time we observed the following error:

Error: API rate limit exceeded for installation ID 5040149.

We think this can be alleviated by simply not using the GitHub API. In fact, even if dorny/paths-filter was making only API call per workflow, we would easily reach the 1000 requests/hour limit with over 80 workflows. Simply push 13 commits in the Vitess organization within the same hour and the limit would be reached.

When using GITHUB_TOKEN, the rate limit is 1,000 requests per hour per repository. For requests to resources that belong to an enterprise account on GitHub.com, GitHub Enterprise Cloud's rate limit applies, and the limit is 15,000 requests per hour per repository.

In dorny/paths-filter's README, the usage of the tool is documented as follows:

- uses: dorny/paths-filter@v2
  with:
    
    ...
    
    # Personal access token used to fetch a list of changed files
    # from GitHub REST API.
    # It's only used if action is triggered by a pull request event.
    # GitHub token from workflow context is used as default value.
    # If an empty string is provided, the action falls back to detect
    # changes using git commands.
    # Default: ${{ github.token }}
    token: ''

Setting token to an empty string should therefore force the use of the git command instead of the GitHub API. After trying to set the token to an empty string, there was a weird behavior where running the following git command would return a large list of modified files (even though they are not modified by the PR).

git log --format= --no-renames --name-status -z -n 1 

In this workflow, we can see that paths-filter is attempting to calculate the diff using the git command. However, as we can see in the screenshot below, they find over 4000 modified files, which does not correspond to the changes made by this commit or the other commits on the PR's branch.

image

The git log --format ... command is used by default in dorny/paths-filter when token is set to an empty string. To fix this issue, I forked dorny/paths-filter (here) and pushed a commit that fixes the issue. I changed all the workflows to use this new fork.

The fix consists of doing a diff using the following command:

git diff --no-renames --name-status -z {BASE_SHA}..{HEAD_SHA}

That way we only capture the difference that happened between the PR's base and the current commit.

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
@frouioui frouioui force-pushed the gh-api-limit-paths-filter branch from 5dd263c to 5be6559 Compare April 19, 2022 08:29
@frouioui frouioui marked this pull request as draft April 19, 2022 08:33
@frouioui frouioui force-pushed the gh-api-limit-paths-filter branch 2 times, most recently from 6103d3a to 5be6559 Compare April 19, 2022 08:48
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
@frouioui frouioui force-pushed the gh-api-limit-paths-filter branch 7 times, most recently from 88bdea4 to 12823e2 Compare April 19, 2022 12:01
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
@frouioui frouioui force-pushed the gh-api-limit-paths-filter branch from 12823e2 to 125b391 Compare April 19, 2022 12:25
@frouioui frouioui marked this pull request as ready for review April 19, 2022 12:47
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Can we open an upstream PR if we haven't yet? The less we have to maintain the better :-)

Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work!
+1 to @mattlord's suggestion of opening an upstream PR.

@deepthi
Copy link
Member

deepthi commented Apr 19, 2022

Not merging this right now, because I believe @frouioui intends to rename the master branch to main on his fork and change all the files to refer to that.

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
@frouioui
Copy link
Member Author

Pull Request to upstream is available dorny/paths-filter#133.

…filter

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
@frouioui frouioui merged commit 19bceba into vitessio:main Apr 20, 2022
@frouioui frouioui deleted the gh-api-limit-paths-filter branch April 20, 2022 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants