fix(backend): persistence agent - workflow not found error should be a permanent error #4486

Merged: 6 commits, Sep 11, 2020

Bobgy
Contributor

@Bobgy Bobgy commented Sep 11, 2020

Description of your changes:
Fixes #4484

The KFP API server should mark "workflow not found" errors on delete as permanent, so that the persistence agent stops retrying them and simply drops the outdated items from its queue.

The error occurs only when the persistence worker cannot update its work queue. The only reported case was caused by too many workflows in the cluster. The error messages are very long and recur constantly, which makes it impossible to tell whether other errors are also causing the persistence agent to fail to GC workflows.

/area backend

@Bobgy Bobgy added the cherrypick-approved label (area OWNER approves to cherry pick this PR to current active release branch) Sep 11, 2020

@Bobgy
Contributor Author

Bobgy commented Sep 11, 2020

/assign @rmgogogo @IronPan

@IronPan
Member

IronPan commented Sep 11, 2020

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: IronPan

@k8s-ci-robot k8s-ci-robot merged commit 29a6aaa into kubeflow:master Sep 11, 2020
@Bobgy Bobgy added the cherrypicked label (cherry picked to release branch `release-x.y`) Sep 14, 2020
@Bobgy Bobgy deleted the be_wf_not_found branch September 14, 2020 02:01
Bobgy added a commit that referenced this pull request Sep 14, 2020
…a permanent error (#4486)

* fix(backend): workflow not found error should be permanent

* failing test case

* Fix logic

* fix another case

* Switched to not found error

* not found error should be permanent
Jeffwan pushed a commit to Jeffwan/pipelines that referenced this pull request Dec 9, 2020
…a permanent error (kubeflow#4486)

Labels
approved, area/backend, cherrypick-approved (area OWNER approves to cherry pick this PR to current active release branch), cherrypicked (cherry picked to release branch `release-x.y`), cla: yes, kind/bug, lgtm, size/M
Development

Successfully merging this pull request may close these issues.

Persistence Agent keep trying to delete workflows that do not exist
5 participants