Check only changed files #2

Open
laughedelic opened this issue Jun 1, 2018 · 1 comment

@laughedelic
Owner

The current simple implementation lists all files in the repo (at a given commit) and checks their formatting. This may take a long time in big repos, so the plan is to check only the files changed in the PR (i.e. assuming that master stays well-formatted).

@laughedelic laughedelic added this to the v0.1.0 milestone Jun 1, 2018
@laughedelic laughedelic self-assigned this Jun 3, 2018
@laughedelic
Owner Author

I tried running it on the sbt/sbt repo, and the current naive implementation (fetching file contents one by one) takes forever.

Here's an idea of how this should work:

  • every time a check is requested, instead of immediately checking the changed files, first get the check result for the parent commit (or for the first ancestor that has a completed check associated with it)
  • if the parent's check was successful, just check the changed files
  • if the parent's check was unsuccessful, compare the changed files between the two commits to see whether the misformatted files from the parent commit were changed (and possibly fixed); only if so, proceed with checking the changes from the head commit
    • this could be more advanced: a check can store annotations (which could be a list of misformatted files), and they can be retrieved later. Then only those files have to be re-checked (if they were changed in the subsequent commit).
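
The decision described in the list above could be sketched roughly as follows. This is only an illustration in Python, not the app's actual code: `Plan`, `plan_check` and their parameters are hypothetical names, and it assumes the parent's misformatted files are already available as a plain list (e.g. recovered from stored annotations).

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Plan:
    # Files already known to be misformatted without running the formatter
    known_bad: FrozenSet[str]
    # Files that actually need to be checked at the head commit
    to_check: FrozenSet[str]


def plan_check(
    parent_conclusion: str,
    parent_misformatted: Iterable[str],
    changed: Iterable[str],
) -> Plan:
    """Decide what to check at head, given the parent's check result.

    All names here are hypothetical; `changed` is the set of files that
    differ between the parent commit and the head commit.
    """
    changed_set = frozenset(changed)
    if parent_conclusion == "success":
        # Parent was clean: only the changed files can be misformatted.
        return Plan(frozenset(), changed_set)
    # Parent failed: files it flagged that head did not touch are still bad.
    known_bad = frozenset(parent_misformatted) - changed_set
    if known_bad:
        # Nothing to run: the head is known to fail already.
        return Plan(known_bad, frozenset())
    # Every flagged file was touched, so check the changes from head.
    return Plan(frozenset(), changed_set)
```

With this shape, a head commit that leaves any flagged file untouched can be reported as failing without fetching a single file.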

A simplified version to start with:

  • just find the closest ancestor commit with a successful check (up to the base branch ref, otherwise we'll have to check all files on the first pull-request)
  • compare the head with this ancestor to get all changed files
  • check only them
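
A minimal sketch of the ancestor search in this simplified version, assuming the commit graph and the check conclusions have already been loaded into plain dictionaries. `closest_successful_ancestor`, `parents` and `conclusions` are hypothetical names for illustration, and the base ref is treated as well-formatted, as assumed earlier in the thread.

```python
from collections import deque
from typing import Dict, List, Set


def closest_successful_ancestor(
    head: str,
    parents: Dict[str, List[str]],
    conclusions: Dict[str, str],
    base: str,
) -> str:
    """Breadth-first walk over head's ancestors.

    Returns the nearest ancestor whose check conclusion is "success".
    The walk stops at `base`, which is assumed to be well-formatted,
    and falls back to `base` if nothing else is found.
    """
    queue = deque(parents.get(head, []))
    seen: Set[str] = set()
    while queue:
        sha = queue.popleft()
        if sha in seen:
            continue
        seen.add(sha)
        if sha == base or conclusions.get(sha) == "success":
            return sha
        queue.extend(parents.get(sha, []))
    return base


# Hypothetical linear history: a <- b <- c <- d, with base "a" and head "d"
history = {"d": ["c"], "c": ["b"], "b": ["a"], "a": []}
ancestor = closest_successful_ancestor("d", history, {"c": "failure", "b": "success"}, "a")
```

The files to check would then be whatever differs between `ancestor` and the head commit.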

Other thoughts:

  • there should be a time limit for a check run (checks even have a timed_out completion status)
  • there should also be a limit on the number of files checked at once (cancel the check with a neutral result and a warning message)
  • checks for the head and its parent commits (of which there may be several) could be triggered as different check runs in the same check suite. This would distribute the effort by queueing the runs (although I don't see an API to request a run, only a check suite)
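
The two limits could look something like this. It is a sketch under assumptions: `run_check`, `is_formatted`, `max_files` and `time_limit_s` are made-up names, the default limits are arbitrary, and the returned strings simply mirror the `neutral` and `timed_out` conclusions mentioned above.

```python
import time
from typing import Callable, Iterable, List, Tuple


def run_check(
    files: Iterable[str],
    is_formatted: Callable[[str], bool],
    max_files: int = 100,
    time_limit_s: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, List[str]]:
    """Check files one by one, respecting a file-count and a time limit.

    Returns (conclusion, misformatted_files). All names and default
    limits are hypothetical.
    """
    paths = list(files)
    if len(paths) > max_files:
        # Too much work for one run: give up with a neutral result.
        return "neutral", []
    deadline = clock() + time_limit_s
    misformatted: List[str] = []
    for path in paths:
        if clock() > deadline:
            # Report what was found so far along with the timeout.
            return "timed_out", misformatted
        if not is_formatted(path):
            misformatted.append(path)
    return ("failure" if misformatted else "success"), misformatted
```

Passing the clock in as a parameter keeps the time limit easy to exercise in tests without actually waiting.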
