Mark efficient watch resumption KEP as implementable #1922

Merged

@wojtek-t wojtek-t commented Aug 4, 2020

Ref #1904

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 4, 2020
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Aug 4, 2020
@wojtek-t
Member Author

@lavalamp @deads2k - PTAL

TODO: Fill in before making `Implementable`.
- [Alpha] unit tests for logic enhancing resource version tracking in reflector
- [Alpha] unit tests for newly added watch cache logic
- [Beta] Integration/e2e test for sending bookmark on kube-apiserver shutdown
Member

I think just integration is fine.

Member Author

changed

- [Alpha] unit tests for logic enhancing resource version tracking in reflector
- [Alpha] unit tests for newly added watch cache logic
- [Beta] Integration/e2e test for sending bookmark on kube-apiserver shutdown
- [Beta] Integration/e2e test for proving that resource version that
Member

Integration sounds fine.

I don't think you need to e2e test this feature; I think integration tests will be easier to make exhaustive.

If you want an e2e test, I'd actually consider something that analyzes e.g. the audit log to count relists per watcher.
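
One way the audit-log idea could be sketched: the snippet below reads an audit log written as JSON lines and counts completed `list` requests per client. It assumes the `audit.k8s.io/v1` event fields `stage`, `verb`, `userAgent`, `user.username` and `objectRef.resource` are present in the log (the audit policy controls what is recorded). The function names `count_lists_per_watcher` and `relists` are hypothetical, and treating every `list` after the first as a relist is an assumption, since the audit log does not mark relists explicitly.

```python
import json
from collections import Counter


def count_lists_per_watcher(lines):
    """Count completed `list` requests per (username, userAgent, resource).

    `lines` is an iterable of JSON-encoded audit events, one per line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # A request can be logged once per stage; count only the final one.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("verb") != "list":
            continue
        key = (
            (event.get("user") or {}).get("username", ""),
            event.get("userAgent", ""),
            (event.get("objectRef") or {}).get("resource", ""),
        )
        counts[key] += 1
    return counts


def relists(counts):
    """Assumption: the first list is the initial one, the rest are relists."""
    return {key: n - 1 for key, n in counts.items() if n > 1}
```

A test could then run a kube-apiserver restart, feed the resulting audit log through `count_lists_per_watcher`, and assert that `relists(...)` stays empty (or below some threshold) for the clients under test.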

Member

Also I kind of expect these integration tests for alpha?

Member Author

Yeah - I agree that integration seems good enough.

Also switched to be required in alpha.


#### Alpha -> Beta Graduation

TODO: Fill in before making `Implementable`.
- Tests marked as Beta implemented and stable for at least 2 weeks
Member

I'd want this for alpha, personally.

Member Author

Fixed.


#### Beta -> GA Graduation

TODO: Fill in before making `Implementable`.
- Enabled in Beta for at least two releases without complaints
Member

I think we need one release in beta in case there's a major problem. This isn't really adding API surface area, so I don't know that a long beta period is necessary or useful.

I'd make the tests exhaustive instead.

Member Author

So you suggest decreasing it to one release?

Given that it takes time until people start using a given release in production (or until cloud providers enable a version), having beta for two versions seems safer to me.

Member

Given the feature, I'll be surprised if it passes all the tests and people still find practical problems with it. I don't mind two releases; I'm just thinking we could give ourselves the option to go to GA after one if everything is going great?

Member Author

Beta is enabled by default, so it probably doesn't change that much. I think two releases are a bit safer, but I don't mind a single release either (we can probably adjust that before going to Beta too).

Contributor

I don't have a strong opinion on the number of betas.

@lavalamp
Member

I think some test should verify that in a split-etcd backend mode (e.g., running events on a different etcd cluster), no RVs leak between resource types.

@wojtek-t wojtek-t force-pushed the efficient_watch_reboot_implementable branch from df3608d to 56a0275 Compare August 11, 2020 09:38
@wojtek-t
Member Author

> I think some test should verify that in a split-etcd backend mode (e.g., running events on a different etcd cluster), no RVs leak between resource types.

Added explicitly to one of the tests in Alpha.

@lavalamp - comments addressed - PTAL

@lavalamp
Member

This LGTM, I'll leave it open a bit for @deads2k to glance at it.


#### Alpha -> Beta Graduation

TODO: Fill in before making `Implementable`.
- Ad-hoc manual rolling-upgrade of kube-apiservers in 5k-node cluster
Contributor

I'd like metrics before going to beta.

Member Author

Added above

@wojtek-t wojtek-t force-pushed the efficient_watch_reboot_implementable branch from 56a0275 to 83ae0bd Compare August 17, 2020 15:24
@deads2k
Contributor

deads2k commented Aug 17, 2020

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 17, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 17, 2020
@k8s-ci-robot k8s-ci-robot merged commit dd5d81a into kubernetes:master Aug 17, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Aug 17, 2020