
Make Knative eventing more serverless and scalable #2152

Closed
aslom opened this issue Nov 6, 2019 · 19 comments
Labels: area/performance, kind/feature-request, lifecycle/stale

Comments

@aslom (Member) commented Nov 6, 2019

Problem
A short explanation of the problem, including relevant restrictions.

Knative eventing (sources, channels, brokers, ...) should scale to zero when not in use and scale up in proportion to the number of incoming events.

Persona:
Which persona is this feature for?

System Integrator, Event consumer (developer)

Exit Criteria
A measurable (binary) test that would indicate that the problem has been resolved.

When no events are coming into Knative eventing, it should scale down to zero (Knative eventing services should consume minimal or no memory and CPU, and channels should scale down as well). When events are received over HTTP, it should scale back up.
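To make this measurable in practice, the check could look roughly like the sketch below: a minimal Go probe using client-go that treats "zero ready replicas while idle" as the binary pass condition. The kubeconfig path, the namespace knative-eventing, and the deployment name imc-dispatcher are assumptions for illustration, not a definitive list of components to watch.

```go
// Minimal sketch of the binary exit-criteria check described above.
// Assumptions for illustration: kubeconfig path, namespace
// "knative-eventing", and deployment name "imc-dispatcher".
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// After an idle period with no events, the data-plane deployment
	// should report zero ready replicas; after events arrive over HTTP,
	// it should report more than zero again.
	dep, err := clientset.AppsV1().Deployments("knative-eventing").
		Get(context.TODO(), "imc-dispatcher", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("scaled to zero: %v\n", dep.Status.ReadyReplicas == 0)
}
```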

Time Estimate (optional):
How many developer-days do you think this may take to resolve?

weeks

Additional context (optional)
Add any other context about the feature request here.

Let's start a discussion: what would be the lowest-hanging fruit?

@aslom (Member, Author) commented Nov 6, 2019

These improvements should also help with the related issue about Broker internal design improvements, #1555.

@aslom (Member, Author) commented Nov 6, 2019

There may be other related issues, such as making the control plane scalable as well?

@aslom (Member, Author) commented Nov 6, 2019

I would like to find out who is interested in this topic: @nachocano @slinkydeveloper @Harwayne @matzew @lionelvillard ?

@aslom (Member, Author) commented Nov 6, 2019

What would be the best way to capture possible options, both short term (MVP, before 1.0?) and as a long-term ideal solution?

I am thinking of starting with a design doc for each sub-topic, such as sources and channels, using the design doc template:
https://github.com/knative/community/blob/master/CONTRIBUTING.md#design-documents
https://docs.google.com/document/d/1QRREfL8gSVSURHMkDLjDtMh1BBsNiwfmaBiGnJUDQ2g/edit

@mikehelmick (Contributor):

/cc @grantr

@nachocano (Contributor) commented Nov 6, 2019 via email

@grantr (Contributor) commented Nov 6, 2019

/area performance

@slinkydeveloper (Contributor) commented Nov 6, 2019

I think the idea of scaling up is quite interesting, but I'm not sure it would be simple to go to 0.
Are we sure we can consider every Knative component stateless, so that it can be scaled down to 0 without data loss?
+1 for the document

@aslom (Member, Author) commented Nov 6, 2019

@slinkydeveloper hopefully we can scale the control plane to zero and only spin it up when a CR is created/modified/deleted? The data plane may be more complex, as some sources require pull or persistent polling connections, so something needs to keep running even when no events are received.
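To illustrate the data-plane concern, here is a minimal sketch (not any actual Knative source implementation) of a pull-based source: the polling loop has to stay resident to notice new items, so the process cannot simply disappear between events. pollExternalSystem and the sink URL are hypothetical; the CloudEvents calls follow the sdk-go v2 client API.

```go
// Sketch of a pull-based event source that resists scale-to-zero:
// the loop below must keep running even when no events are produced.
package main

import (
	"context"
	"log"
	"time"

	cloudevents "github.com/cloudevents/sdk-go/v2"
)

func main() {
	client, err := cloudevents.NewClientHTTP()
	if err != nil {
		log.Fatalf("failed to create client: %v", err)
	}
	// Hypothetical sink URL for illustration.
	ctx := cloudevents.ContextWithTarget(context.Background(),
		"http://broker-ingress.example/default")

	for {
		items := pollExternalSystem() // hypothetical: poll a queue, DB, etc.
		for _, item := range items {
			event := cloudevents.NewEvent()
			event.SetType("example.polled.item")
			event.SetSource("example/polling-source")
			_ = event.SetData(cloudevents.ApplicationJSON, item)
			if result := client.Send(ctx, event); cloudevents.IsUndelivered(result) {
				log.Printf("failed to send: %v", result)
			}
		}
		// The process can never go away while it is responsible for polling.
		time.Sleep(10 * time.Second)
	}
}

// Placeholder for whatever external system the source polls.
func pollExternalSystem() []map[string]string { return nil }
```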

@slinkydeveloper (Contributor):

Is there any way atm to let k8s spin up the control plane only when CR changes happen?

@antoineco (Contributor):

@slinkydeveloper that would require one component to always be up and running to watch for changes to relevant resources and then start the corresponding controller. Kubernetes does not actively notify controllers when changes occur (push); instead, controllers maintain a long-lived connection to receive events about the resources they are interested in (pull).
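For illustration, this is roughly what that long-lived pull connection looks like with client-go's shared informers: a controller process like the sketch below has to already be running to receive anything, which is why Kubernetes cannot start it on demand. The kubeconfig path is a placeholder, and ConfigMaps is just an arbitrary example resource.

```go
// Sketch of the pull model described above: the controller holds a
// long-lived watch and reacts to events locally; nothing is pushed to
// a process that is not already running.
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
	informer := factory.Core().V1().ConfigMaps().Informer()

	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { fmt.Println("added") },
		UpdateFunc: func(oldObj, newObj interface{}) { fmt.Println("updated") },
		DeleteFunc: func(obj interface{}) { fmt.Println("deleted") },
	})

	stop := make(chan struct{})
	factory.Start(stop) // opens the long-lived watch connections
	cache.WaitForCacheSync(stop, informer.HasSynced)
	<-stop // block forever; the watch must stay open to receive events
}
```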

@slinkydeveloper (Contributor):

@antoineco If that's the case, I'm kinda worried this kind of solution only makes the architecture more complex without giving relevant benefits. If we consider only eventing, it's quite easy, but eventing itself is somewhat "extensible" through eventing-contrib components, so how do we manage all those components with the same "listener & autoscaler" machinery? If we end up with multiple "listener & autoscaler" components, it doesn't make any sense 😄
BTW for the data plane I really agree we should autoscale up and down (not necessarily to 0).
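As a sketch of that "up and down, but not to zero" shape, a plain HorizontalPodAutoscaler with a floor of one replica would express it. The target deployment name imc-dispatcher is an assumption for illustration, and real Knative components may well need their own autoscaling mechanism instead.

```go
// Sketch: keep a data-plane deployment between 1 and 10 replicas based
// on CPU, rather than scaling it to zero. The name "imc-dispatcher" is
// an assumption for illustration.
package main

import (
	autoscalingv2 "k8s.io/api/autoscaling/v2"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func dispatcherHPA() *autoscalingv2.HorizontalPodAutoscaler {
	minReplicas := int32(1) // floor of one: scale down, but never to zero
	cpuTarget := int32(70)
	return &autoscalingv2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "imc-dispatcher",
			Namespace: "knative-eventing",
		},
		Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
				APIVersion: "apps/v1",
				Kind:       "Deployment",
				Name:       "imc-dispatcher",
			},
			MinReplicas: &minReplicas,
			MaxReplicas: 10,
			Metrics: []autoscalingv2.MetricSpec{{
				Type: autoscalingv2.ResourceMetricSourceType,
				Resource: &autoscalingv2.ResourceMetricSource{
					Name: "cpu",
					Target: autoscalingv2.MetricTarget{
						Type:               autoscalingv2.UtilizationMetricType,
						AverageUtilization: &cpuTarget,
					},
				},
			}},
		},
	}
}

func main() { _ = dispatcherHPA() } // construct only; applying it is out of scope
```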

@antoineco (Contributor):

Fully agree with everything you said; I really don't see any benefit in scaling the control plane.

@aslom (Member, Author) commented Nov 7, 2019

@antoineco @slinkydeveloper that starts to matter if you have a very small Knative cluster (and want scaling to zero to conserve resources) or a very large one (a lot of activity in the control plane, and maybe a lot of controllers installed for different source and channel types)?

Also, that may be a bigger issue for multi-tenant clusters, as you would want to run all supported types of sources and channels so tenants can easily create them?

I am going to open a separate issue to discuss the control plane; even if we decide there is no need to improve it now, we will have a record of it.

@antoineco (Contributor) commented Nov 8, 2019

Moved my comment to #2161 (comment)

@github-actions (bot):

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

The github-actions bot added the lifecycle/stale label on Nov 25, 2020.
@aslom (Member, Author) commented Dec 6, 2020

/reopen

The github-actions bot removed the lifecycle/stale label on Jan 23, 2021.
@slinkydeveloper (Contributor):

Is this still relevant? Since I see you have worked on this in more specific issues, can we close this more general one?

@github-actions (bot):

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

The github-actions bot added the lifecycle/stale label on May 13, 2021.