Persistence Agent keep trying to delete workflows that do not exist #4484

Closed
Bobgy opened this issue Sep 11, 2020 · 0 comments · Fixed by #4486

Bobgy commented Sep 11, 2020

#4482 surfaced a separate problem: there are many error messages like the following.

Transient failure while syncing resource (default/train-pipeline-ltd84): CustomError (code: 0): Syncing Workflow (train-pipeline-ltd84): transient failure: CustomError (code: 0): Error while reporting workflow resource (code: Internal, message: Report workflow failed.: InternalServerError: Failed to delete the completed workflow for run 3e649710-8b97-4c64-9f39-a0536d8abf1b: workflows.argoproj.io "train-pipeline-ltd84" not found):

I suspect the workqueue is filled with workflows that have already been deleted, so it cannot move forward with others.

The Fix

The KFP API server should mark "workflow not found" errors from the delete call as permanent, so the persistence agent does not keep retrying them.
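
To make the proposal concrete, here is a minimal sketch of the idea in Python. It is not the actual KFP code: every name in it (`WorkflowNotFound`, `PermanentError`, `TransientError`, `delete_workflow`, `report_workflow`, `drain`) is hypothetical, and an in-memory set stands in for the cluster. It only illustrates the classification: a "not found" on delete is reported as permanent, so the agent drops the item instead of requeueing it.

```python
from collections import deque


class WorkflowNotFound(Exception):
    """Hypothetical stand-in for the cluster's 'not found' error."""


class TransientError(Exception):
    """A failure that may succeed on retry."""


class PermanentError(Exception):
    """A failure that retrying cannot fix."""


def delete_workflow(cluster, name):
    # Hypothetical stand-in for the real delete call against the cluster.
    if name not in cluster:
        raise WorkflowNotFound(f'workflows.argoproj.io "{name}" not found')
    cluster.remove(name)


def report_workflow(cluster, name):
    """Server side (sketch): delete the completed workflow and classify failures."""
    try:
        delete_workflow(cluster, name)
    except WorkflowNotFound as err:
        # The workflow is already gone, so retrying can never succeed.
        raise PermanentError(str(err)) from err


def drain(queue, cluster, max_attempts=5):
    """Agent side (sketch): requeue only transient failures.

    Returns the names that were dropped because the failure was permanent.
    """
    dropped = []
    attempts = {}
    while queue:
        name = queue.popleft()
        try:
            report_workflow(cluster, name)
        except PermanentError:
            dropped.append(name)  # do not requeue
        except TransientError:
            attempts[name] = attempts.get(name, 0) + 1
            if attempts[name] < max_attempts:
                queue.append(name)  # retry later
    return dropped
```

In this sketch, a queue holding an already-deleted workflow followed by a live one drains completely: the deleted one is dropped on its first attempt, and the live one is processed normally rather than waiting behind endless retries.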
