Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Envoy's Dispatcher abstraction #14401

Closed
akonradi opened this issue Dec 14, 2020 · 16 comments
Closed

Refactor Envoy's Dispatcher abstraction #14401

akonradi opened this issue Dec 14, 2020 · 16 comments
Assignees
Labels
enhancement Feature requests. Not bugs or questions. stale stalebot believes this issue/PR has not been touched recently

Comments

@akonradi
Copy link
Contributor

akonradi commented Dec 14, 2020

This FR was initiated by a discussion on moving scaled timer creation into the Dispatcher interface (#13800 (comment)). I've been working on exactly that, but was having some trouble with testing that scaled timer creation was occurring without duplicating the tests for the ScaledRangeTimerManager. The problem is that ScaledRangeTimerManager is built on top of the existing Dispatcher interface. In the name of efficiency, we could merge the impl into the already-bloated dispatcher (getting rid of virtual function calls and eliminating layering concerns), but that seems counterproductive.

Instead, I'm proposing that the existing Dispatcher be split into a lower-level class that handles files, sockets, timers, and other things that deal directly with events, and an upper-level class that wraps the lower to provide nice abstractions like scaled timers, the DNS resolver, stats, and whatever else is built on top of the low-level abstractions. Implementing the high-level stuff purely in terms of the low-level abstractions makes testing easier (since we can inject mocks) and reduces the amount of code that deals directly with libevent.

@mattklein123 @htuch WDYT?

@akonradi akonradi added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Dec 14, 2020
@akonradi akonradi changed the title Refactor Envoy worker abstraction Refactor Envoy's Dispatcher abstraction Dec 14, 2020
@antoniovicente
Copy link
Contributor

antoniovicente commented Dec 14, 2020

Consider using the Scheduler interface in include/envoy/event/timer.h to create low-level timers and thus resolve the circular dependency on dispatcher.

Other methods you're using from the dispatcher interface:
approximateMonotonicTime()
isThreadSafe()

Both of these could be added to the Scheduler interface.

@antoniovicente antoniovicente removed the triage Issue requires triage label Dec 14, 2020
@akonradi
Copy link
Contributor Author

Thanks, that'll solve my scaled timer issue. I'm still wondering if it's worth splitting up the Dispatcher, though, since it seems to export abstractions that could be built entirely using some of the lower-level abstractions it also exports.

@antoniovicente
Copy link
Contributor

Possibly, but we would need to make sure to limit use of the lower level interface. Most code should interact with the full dispatcher interface.

@akonradi
Copy link
Contributor Author

Actually, the Scheduler interface doesn't resolve the circular dependency since Scheduler::createTimer's second argument is a Dispatcher& 🤦

@akonradi
Copy link
Contributor Author

...though it looks like that might be easy enough to remove since the scheduler comes from the dispatcher.

@antoniovicente
Copy link
Contributor

Could be resolved by adding setTrackedObject to the scheduler interface, or some new interface related to tracking current state in the event of a crash.

@antoniovicente antoniovicente self-assigned this Dec 15, 2020
@htuch
Copy link
Member

htuch commented Dec 16, 2020

Seems reasonable, but this could also be an opportunity to move some things out of dispatcher that might be concept/scope creep. For example, I'm not sure if the DNS resolver really needs to live there. Ultimately some of the API choices may have been influenced by the use of libevent APIs in Envoy's early history, e.g. socket listeners, and could be better factored based on the current architecture.

@mattklein123
Copy link
Member

Ultimately some of the API choices may have been influenced by the use of libevent APIs in Envoy's early history, e.g. socket listeners, and could be better factored based on the current architecture.

I think this is accurate. In general I think the idea of a low level interface that is very thin (and for example might allow us to completely swap out libevent for libuv/libev/etc. more easily) and then a high layer interface on top makes sense to me. In practice I don't have a good feel for how difficult that would be, but no major objection from me.

@antoniovicente
Copy link
Contributor

I'm drafting a change right now to see what it would look like. Basically, remove the Dispatcher argument from the Schedulable::createTimer API, move a few useful methods to Scheduler and making Dispatcher a subclass of Scheduler.

Methods to move to Scheduler:
approximateMonotonicTime()
isThreadSafe()
setTrackedObject()

We could consider moving fileevent and friends to Scheduler in the future.

@antoniovicente
Copy link
Contributor

I have two possible drafts:
master...antoniovicente:dispatcher_base
master...antoniovicente:dispatcher_interface

My thoughts on antoniovicente:dispatcher_base:

  1. The change is simple and minimally intrusive. Does it address the basic need?

My thoughts so far on antoniovicente:dispatcher_interface:

  1. Moving just timers doesn't feel like the right separation point. We need to move post callbacks also in order to properly encapsulate run tid which we ASSERT when running post callbacks. Low level API should include at least timers, fd events, schedulable callbacs, post, deferred delete and watchdog integration.
  2. Extending the Scheduler interface doesn't seem to be the right thing since we end up with a bunch of interface methods that are effectively unreachable. If we have LibeventScheduler implement the low level interface, we should move the Timer wrapper integration used by SimulatedTime to LibeventScheduler.
  3. We could consider creation of the LibeventScheduler via a factory or pass it in as a dispatcher constructor argument in order to make the low-level event loop more pluggable, but there will be some slight performance consequences due to having to do a few more virtual calls inside the dispatcher. But being able to more easily plug in an alternative event loop seems attractive enough to justify the costs. That said, I'm fairly sure that meaningful integration with libuv may require deeper API changes. I think that if want to continue using FileEvents we would be limited to only using the epoll engine within libuv.

@akonradi
Copy link
Contributor Author

dispatcher_base addresses the basic need, which is separating out low-level stuff that the dispatcher implements from higher-level stuff built on top, yeah.

@antoniovicente
Copy link
Contributor

antoniovicente commented Dec 16, 2020

Do remember that higher level constructs like connections depend on ScaledTimer and are also created through the dispatcher. ScaledTimers are relatively low level.

@akonradi
Copy link
Contributor Author

Sounds like we need a DAG :D

Fair. Here's a crazy idea: get rid of createScaledTimer and just add an optional ScaledTimerMinimum argument to createTimer. If it's ScaledMinimum(1), return a regular timer; otherwise return a scaled timer.

(I don't recommend this because scaled timers are actually not that efficient unless there are very few distinct timeouts)

@antoniovicente
Copy link
Contributor

Sounds like we need a DAG :D

Fair. Here's a crazy idea: get rid of createScaledTimer and just add an optional ScaledTimerMinimum argument to createTimer. If it's ScaledMinimum(1), return a regular timer; otherwise return a scaled timer.

I think this would create a circular dependency between ScaledTimer and the createTimer method that may itself create a scaled timer.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 15, 2021
@github-actions
Copy link

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests. Not bugs or questions. stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

No branches or pull requests

4 participants