[drci] Workflow file errors remain after they are retried #4969

Closed
clee2000 opened this issue Feb 22, 2024 · 4 comments

@clee2000
Contributor

Dr. CI does not handle workflow file errors well after they are retried.

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120358

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEVs

There is 1 active merge blocking SEVs. Please view them below:

If you must merge, use @pytorchbot merge -f.

✅ You can merge normally! (5 Unrelated Failures)

As of commit c83eb30280626ff84d7a2950d200cb3257e16fb2 with merge base 8fa634070189a5567b7bb0ddf1f389d6e43bebe5 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@clee2000 clee2000 changed the title from "[drci]" to "[drci] Workflow file errors remain after they are retried" Feb 22, 2024
@huydhn
Contributor

huydhn commented Mar 12, 2024

I think this is the same issue as #4987 and is fixed by #4998

@huydhn
Contributor

huydhn commented Mar 12, 2024

AI: We need to handle the case where failing workflows keep showing up in the Dr. CI box even after they have been retried. Otherwise, people will need to force merge to land their changes.

AI: Some errors look flaky from the user's point of view but are too critical for CI to mark as flaky. If we see an infra failure, it's better not to ignore it and to rerun the job instead.
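
A minimal sketch of how the two action items above could fit together, not Dr. CI's actual implementation. It assumes each run record carries a name, a retry counter, and a conclusion; the `JobRun` type, the `failure_kind` labels, and both function names are hypothetical and chosen only for illustration. The idea is to keep only the latest attempt of each workflow or job, so a retried workflow file error stops being reported, and to send infra failures to a rerun list instead of treating them as flaky.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRun:
    # Hypothetical record; field names are assumptions, not Dr. CI's schema.
    name: str               # workflow or job name, e.g. ".github/workflows/pull.yml"
    run_attempt: int        # 1 for the first run, incremented on each retry
    conclusion: str         # "success", "failure", ...
    failure_kind: str = ""  # assumed label: "workflow_file_error", "infra", "test", or ""


def latest_attempts(runs):
    """Keep only the most recent attempt for each workflow/job name."""
    latest = {}
    for run in runs:
        current = latest.get(run.name)
        if current is None or run.run_attempt > current.run_attempt:
            latest[run.name] = run
    return list(latest.values())


def triage(runs):
    """Split the latest failures into ones to report and ones to rerun."""
    report = {"failed": [], "rerun": []}
    for run in latest_attempts(runs):
        if run.conclusion != "failure":
            continue  # a successful retry supersedes the earlier failure
        if run.failure_kind == "infra":
            report["rerun"].append(run.name)   # too critical to mark as flaky
        else:
            report["failed"].append(run.name)
    return report


runs = [
    JobRun(".github/workflows/pull.yml", 1, "failure", "workflow_file_error"),
    JobRun(".github/workflows/pull.yml", 2, "success"),
    JobRun("linux-build", 1, "failure", "infra"),
    JobRun("linux-test", 1, "failure", "test"),
]
print(triage(runs))
```

In this example the workflow file error from the first attempt is dropped because the second attempt succeeded, `linux-build` is queued for a rerun, and only `linux-test` is reported as a failure.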

@huydhn huydhn self-assigned this Mar 12, 2024
@clee2000
Contributor Author

clee2000 commented Apr 2, 2024

I think this is fixed by #5038

Leaving this open because the second AI item mentioned by Huy in the previous comment is not covered by this and I'm not sure if we still want to do it

@huydhn
Contributor

huydhn commented Apr 3, 2024

This issue has been fixed. There are some cases like pytorch/pytorch#123104 where the workflow path still appears in the job name used by mergebot, but that looks like a different issue, pytorch/pytorch#122422.

@huydhn huydhn closed this as completed Apr 3, 2024