DrCI should never classify failure to run a workflow as flaky #4987

Closed
malfet opened this issue Mar 6, 2024 · 0 comments · Fixed by #4998
malfet commented Mar 6, 2024

From pytorch/pytorch#121323 DrCI error message:
image
which I guess references the following failed attempt to run the workflow: https://github.com/pytorch/pytorch/actions/runs/8176971490

Those types of failures should really be merge blocking

@huydhn huydhn self-assigned this Mar 11, 2024
huydhn added a commit that referenced this issue Mar 12, 2024
Fixes #4987

The Dr.CI logic behind `isInfraFlakyJob` and `isLogClassifierFailed` has
a false positive: it misclassifies GitHub's failure to dispatch the whole
workflow as flaky, for example in
pytorch/pytorch#121317.

This logic should apply only to workflow jobs, not to workflow runs.
The two can be told apart by the `workflowId` field, which is
set to `null` whenever the record is a workflow run.
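
As a rough illustration of the `workflowId` check described above, here is a minimal TypeScript sketch. Only the `workflowId` field and its `null` meaning come from the description; the `JobRecord` shape and the `isWorkflowJob` and `canBeClassifiedAsFlaky` helpers are hypothetical names for illustration and are not the actual Dr.CI code.

```typescript
// Hypothetical, simplified record shape. Only `workflowId` is taken from the
// description above; the other fields are assumptions for illustration.
interface JobRecord {
  name: string;
  conclusion: string;
  // null when the record is a workflow run rather than a workflow job
  workflowId: number | null;
}

// A record counts as a workflow job only when it has a parent workflow ID.
function isWorkflowJob(record: JobRecord): boolean {
  return record.workflowId !== null;
}

// Hypothetical wrapper: a flaky heuristic is consulted only for workflow jobs.
// A failed workflow run is never reported as flaky, whatever the heuristic says.
function canBeClassifiedAsFlaky(
  record: JobRecord,
  flakyHeuristic: (r: JobRecord) => boolean
): boolean {
  if (!isWorkflowJob(record)) {
    return false;
  }
  return flakyHeuristic(record);
}

// Example records with made-up values.
const failedRun: JobRecord = { name: "pull", conclusion: "failure", workflowId: null };
const failedJob: JobRecord = { name: "pull / build", conclusion: "failure", workflowId: 12345 };

// Stand-in heuristic that flags everything, to show the guard taking effect.
const alwaysFlaky = (_r: JobRecord): boolean => true;

const runIsFlaky = canBeClassifiedAsFlaky(failedRun, alwaysFlaky);
const jobIsFlaky = canBeClassifiedAsFlaky(failedJob, alwaysFlaky);
```

Putting the guard in front of the heuristics, rather than inside each one, would keep a failed workflow run merge blocking even if a new heuristic is added later.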

### Testing

With this change, the unit test and the local curl command below both mark these cases as legit failures:

```
curl --request POST \
--url "http://localhost:3000/api/drci/drci?prNumber=121317" \
--header "Authorization: TOKEN" \
--data 'repo=pytorch'
```