Out-of-order commits on different PRs result in skipped builds #26
With some hack-n-slash debugging, managed to capture the following debug statement
From the following code
Thanks for raising the issue and all the debugging you have done! This issue is slightly related to https://github.com/itsdalmo/github-pr-resource/issues/22, because I had to switch to using With that in mind, I think GraphQL and
As you say, the first Obviously this is not an ideal solution, but I've struggled to find an alternative in the issue I linked above. Perhaps the solution is to just skip filtering on timestamps altogether, output versions that have been discovered before, and let Concourse figure out which ones are new? (I believe this is how the old github-pullrequest-resource does it.) Thoughts?
Agreed, the evidence I'm seeing all points to behaves-as-expected given the timestamps and order of operations For our use case, we would be fine with latest-commit-for-each-PR as the array of versions, without any timestamp filtering...just giving that a test now. Do you know of a good resource that would explain how Concourse would handle this? |
Ah, I think I understand now how this could work with out-of-order commits Add an extra field to each
Testing with a release candidate of #28, seeing every commit trigger a build regardless of order of submission. Concourse looks happy, even though the |
Thanks for filing the issue, finding a solution and submitting a PR! 👏 That being said, this all feels like a very hacky solution to me since we essentially end up in a situation where every version also includes a copy of all other versions (minus the commit SHA), which is something that is supposed to be handled by Concourse itself. I'm leaning towards removing
No worries, happy to contribute! If the resource always emits the latest commit for all PRs, what sort order is chosen? This is not a quote, but my understanding of how resource versions are treated
Does that sound right? If so, it implies to me that sorting needs to occur on something other than PR Number, commit hash, committed date, or any other metadata that is not semantically equivalent to "has this PR+commit already been seen by Concourse" |
I'm really glad to see this is a known issue. I am contemplating migrating to this resource from The modified |
I'm currently evaluating this resource as well and I think I've experienced an issue related to this one. My test repository has a few old branches. I opened a PR for one of the most recent ones, the webhook succeeded in triggering the resource, and all was well. Then I opened a second PR for an older branch and nothing happened; the resource would not see it. I had to rebase/amend and force-push that branch, updating the commit date, to finally make the resource pick it up.
Just want to clarify that a similar issue existed on the old resource |
I'm seeing the same issue but I'm not sure how to debug it, nor whether it's related to out-of-order commits. Developers work in their forks, and PRs in our repo almost always have just one commit. When changes to a PR are requested, the developer just force-pushes their branch. Maybe the issue arises somewhere between checks when a developer force-pushes.
I have the same issue. Are there any plans to fix this, or has anyone found a workaround?
@insider89 - I think the following is the only workaround:
However, I have not given it much thought since I made that comment; I've sort of been holding out to see how Resources v2 and spaces would affect this before making any changes. |
Hello; we use Concourse to build a medium-sized project and are seeing skipped PR checks. We have not managed to measure the impact, but it's definitely noticeable. Is there any way to work around this? From what I've seen in #22, would "using pushDate, if it exists; otherwise resorting to committedDate" be an option?
I'm currently seeing skipped PRs as well, and suspect maybe it's related to this. It's a little unclear on the primary documentation whether or not all open PRs are supposed to be surfaced. That's not what we're seeing and it's very confusing to developers. |
I'm seeing this issue, which seems related: Ref 'A' is pushed. The developer rebases, which puts ref 'B' above ref 'A' in the The effect of this, though, is confusing to the developer. They can see that the |
@itsdalmo note that this issue manifests quite often at our team, basically forcing us to re-trigger PR checks quite often (on daily to weekly basis) by empty commits/force pushes. There's also some data that e.g. DigitalOcean noticed the issue. The "Spaces" might not happen in the near future and although Resources v2 sound like a way to properly tackle this issue, it still might take some time to land in prod release. Would there be any reasonable way to work around this for the time being by e.g. applying DO's fix? |
@itsdalmo Is there any update on this issue? Concourse v5.8.0 has come out with the first precursor work on {{spaces}} but we're a long way from that being production ready. |
Would like thoughts on the idea of adding to the {
  "source": {
    "repository": "git://some-uri",
"owner": "develop",
"status_context": "concourse-ci/build"
},
...
} What I'm thinking is when you run test/build on a PR/commit you usually push back a status on the commit to indicate its pending/success/failure. So instead of filtering out commits based on date we filter out commits based on them already having a status/context set on them. I'm seeing that when a commit doesn't have a matching context name for a status on the commit it returns "status": {
"context": null
} Otherwise you get some values: "status": {
"context": {
"context": "concourse-ci/build",
"state": "SUCCESS"
}
} We would need to update the graphql query to include returning the context for a commit. Not sure how much more costly that makes the query. I'll play with this idea ... but let me know any initial thoughts you may have on it. |
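To make the status-context idea above concrete, here is a minimal Go sketch. It is not the code from PR #189; the `Commit` type, the `sha` field, and the `unbuiltSHAs` helper are hypothetical names used only for illustration, and the JSON shape is assumed to match the two `status` examples quoted above. A commit is treated as "not yet built" when its status context comes back as `null`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StatusContext mirrors the non-null "context" object quoted above.
type StatusContext struct {
	Context string `json:"context"`
	State   string `json:"state"`
}

// CommitStatus holds a pointer so that a JSON null decodes to nil.
type CommitStatus struct {
	Context *StatusContext `json:"context"`
}

// Commit is a hypothetical, simplified commit record; "sha" is an assumed field.
type Commit struct {
	SHA    string       `json:"sha"`
	Status CommitStatus `json:"status"`
}

// sample is made-up data in the shape described in the comment above.
const sample = `[
  {"sha": "aaa", "status": {"context": null}},
  {"sha": "bbb", "status": {"context": {"context": "concourse-ci/build", "state": "SUCCESS"}}}
]`

// unbuiltSHAs returns the SHAs of commits that have no status for the
// configured context, i.e. commits the pipeline has not reported on yet.
func unbuiltSHAs(raw string) ([]string, error) {
	var commits []Commit
	if err := json.Unmarshal([]byte(raw), &commits); err != nil {
		return nil, err
	}
	var out []string
	for _, c := range commits {
		if c.Status.Context == nil {
			out = append(out, c.SHA)
		}
	}
	return out, nil
}

func main() {
	shas, err := unbuiltSHAs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(shas)
}
```

One consequence of this design is that a commit only stops being emitted once some job has pushed a status for it, so a pipeline that never reports a status would see the same commits on every check.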
I've opened PR #189 with my stab at implementing my above comment. Let me know your thoughts! |
This resource can inadvertently miss Pull Requests due to out-of-order commits across PRs. If PR#2 is opened after PR#1, but the head commit of PR#2 is older than the head commit of PR#1, the resource will not include PR#2 in the list of new versions provided to Concourse. Rather than attempt to find a different way of tracking which PRs are "new" given an input version, we can remove the date-based filtering and return all open PRs. Concourse can deduplicate versions based on metadata, which means that we will only trigger new jobs for versions that Concourse hasn't seen before. This makes it easier for teams to use this resource to track PRs in Concourse, since they no longer have to ensure that a PR has a later head commit than all currently-opened PRs in order to notify Concourse that their PR exists.
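A rough Go sketch of the "return all open PRs and let Concourse deduplicate" approach follows. The `seenVersions` type is only a toy model of a version history keyed on the full version, not Concourse's actual implementation, and the `Version` fields are assumptions made for illustration.

```go
package main

import "fmt"

// Version is a hypothetical version emitted by a check: one per PR head commit.
type Version struct {
	PR     string
	Commit string
}

// seenVersions is a toy stand-in for the version history kept by Concourse.
type seenVersions map[Version]bool

// record stores every emitted version and returns only those not seen before.
func (s seenVersions) record(emitted []Version) []Version {
	var fresh []Version
	for _, v := range emitted {
		if !s[v] {
			s[v] = true
			fresh = append(fresh, v)
		}
	}
	return fresh
}

// simulateUnfilteredChecks runs three checks that always return every open PR
// and reports how many versions were new each time.
func simulateUnfilteredChecks() (int, int, int) {
	seen := seenVersions{}
	pr1 := Version{PR: "1", Commit: "aaa"}
	pr2 := Version{PR: "2", Commit: "bbb"} // head commit older than PR 1's

	first := seen.record([]Version{pr1})
	second := seen.record([]Version{pr1, pr2})
	third := seen.record([]Version{pr1, pr2})
	return len(first), len(second), len(third)
}

func main() {
	a, b, c := simulateUnfilteredChecks()
	fmt.Println(a, b, c)
}
```

Because no date comparison is involved, PR 2 is picked up on the second check regardless of how old its head commit is, and the third check produces nothing new.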
This supersedes #205. This resource can inadvertently miss Pull Requests due to out-of-order commits across PRs. If PR#2 is opened after PR#1, but the head commit of PR#2 is older than the head commit of PR#1, the resource will not include PR#2 in the list of new versions provided to Concourse. In #205, I removed the date filter entirely. This ensures that the PR resource will find all PRs that match the explicitly-configured filters. While Concourse can detect and ignore duplicate versions, it has to run a database query for every version returned by a `check`, so removing the date filter entirely would increase load on a Concourse database. (That said, I'm not sure whether this increased load is a particular concern, and other resources don't seem to make much effort to avoid returning duplicate versions from a `check`.) To avoid that extra load on a Concourse database, this change instead replaces the filter by commit date in `check.go` with a filter by updated date in the GraphQL query to list pull requests. This should reduce the number of duplicate versions returned by a `check` while still allowing the PR resource to detect PRs with out-of-order head commits.
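The difference between filtering on commit date and filtering on updated date can be sketched in Go as below. In the change described above the filter is applied in the GraphQL query; this sketch applies it in memory purely for illustration, and the `PullRequest` fields and the `filterByUpdatedAt` helper are assumed names, not the resource's real types.

```go
package main

import (
	"fmt"
	"time"
)

// PullRequest is a hypothetical, simplified view of an open PR.
type PullRequest struct {
	Number        int
	CommittedDate time.Time // date of the head commit
	UpdatedAt     time.Time // last time the PR itself changed
}

// filterByUpdatedAt keeps PRs updated at or after the given time. PRs updated
// exactly at `since` are returned again; such duplicates are left for
// Concourse to ignore.
func filterByUpdatedAt(prs []PullRequest, since time.Time) []PullRequest {
	var out []PullRequest
	for _, pr := range prs {
		if !pr.UpdatedAt.Before(since) {
			out = append(out, pr)
		}
	}
	return out
}

// includedPRNumbers builds the out-of-order scenario and returns the PR
// numbers that survive the updated-date filter.
func includedPRNumbers() []int {
	base := time.Date(2019, time.January, 1, 12, 0, 0, 0, time.UTC)
	pr1 := PullRequest{Number: 1, CommittedDate: base.Add(time.Second), UpdatedAt: base.Add(time.Second)}
	// PR 2 has an older head commit but was opened (and so updated) later.
	pr2 := PullRequest{Number: 2, CommittedDate: base, UpdatedAt: base.Add(5 * time.Second)}

	var numbers []int
	for _, pr := range filterByUpdatedAt([]PullRequest{pr1, pr2}, pr1.UpdatedAt) {
		numbers = append(numbers, pr.Number)
	}
	return numbers
}

func main() {
	fmt.Println(includedPRNumbers())
}
```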
This resource can inadvertently miss Pull Requests due to out-of-order commits across PRs. If PR#2 is opened after PR#1, but the head commit of PR#2 is older than the head commit of PR#1, the resource will not include PR#2 in the list of new versions provided to Concourse. Rather than attempt to find a different way of tracking which PRs are "new" given an input version, we can remove the date-based filtering and return all open PRs. Concourse can deduplicate versions based on metadata, which means that we will only trigger new jobs for versions that Concourse hasn't seen before. This makes it easier for teams to use this resource to track PRs in Concourse, since they no longer have to ensure that a PR has a later head commit than all currently-opened PRs in order to notify Concourse that their PR exists. (cherry picked from commit 50ef79a)
This reverts commit fd89b79.
Close #26: Filter PRs by updated date instead of commit date
We've been evaluating this resource, and have observed skipped CI builds "randomly"
Broadly, I'm looking for a path to more robust handling of potentially out-of-order commit timestamps
Our pipeline is fairly standard, and triggered via webhooks
After further investigation, I've managed to reproduce reliably with the following workflow in a test repository (commits with timestamps separated by 1 second):
Also reproduces the "skipped build", when the order is reversed (so nothing special about a particular branch):
What does not reproduce (commits with timestamp to the same second):
I'm curious about the logic on https://github.com/itsdalmo/github-pr-resource/blob/master/check.go#L38, and wondering whether the GitHub GraphQL API is potentially mangling the commit timestamps, or perhaps does not guarantee that commits appear across PRs at the same moment?
I.e. perhaps when the GraphQL API is queried from the first webhook-initiated `check`, PR-2 contains the latest commit, but PR-1 does not (perhaps not yet cache-invalidated/re-indexed/refreshed). Immediately afterwards, when the second webhook-initiated `check` queries the GraphQL API, the missing commit may be backfilled into PR-1, but the timestamp check on L38 will exclude it... I'm really scratching my head as to what is going on.
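The suspected failure mode can be reproduced in isolation with a small Go sketch. It assumes the check on L38 behaves like a strict "head commit is newer than the last emitted version" comparison; the `PullRequest` type and `filterByCommittedDate` are hypothetical simplifications, not the resource's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// PullRequest is a hypothetical, simplified view of an open PR.
type PullRequest struct {
	Number        int
	CommittedDate time.Time // date of the head commit
}

// filterByCommittedDate keeps only PRs whose head commit is strictly newer
// than the head commit of the last version handed to Concourse.
func filterByCommittedDate(prs []PullRequest, last time.Time) []PullRequest {
	var out []PullRequest
	for _, pr := range prs {
		if pr.CommittedDate.After(last) {
			out = append(out, pr)
		}
	}
	return out
}

// simulateChecks runs two consecutive checks and returns how many new
// versions each one would emit.
func simulateChecks() (first, second int) {
	base := time.Date(2019, time.January, 1, 12, 0, 0, 0, time.UTC)
	pr1 := PullRequest{Number: 1, CommittedDate: base.Add(time.Second)}
	pr2 := PullRequest{Number: 2, CommittedDate: base} // one second older

	// The first check only sees the PR with the newer head commit.
	firstCheck := filterByCommittedDate([]PullRequest{pr1}, time.Time{})
	last := firstCheck[len(firstCheck)-1].CommittedDate

	// The second check sees both PRs, but PR 2's head commit predates `last`.
	secondCheck := filterByCommittedDate([]PullRequest{pr1, pr2}, last)
	return len(firstCheck), len(secondCheck)
}

func main() {
	first, second := simulateChecks()
	fmt.Printf("first check: %d new, second check: %d new\n", first, second)
}
```

Under these assumptions the second check emits nothing, so the PR with the older head commit never triggers a build, which matches the skipped builds described above.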
I'll continue investigation, with a forked version of this resource with some debugging to get more visibility into what is actually being retrieved from the API.