
Scalable scheduler implementation #1918

Closed
sbordet opened this issue Oct 23, 2017 · 5 comments

Comments

@sbordet
Contributor

sbordet commented Oct 23, 2017

Under high request rate, the scheduler may be subject to high contention.

Scheduled tasks are added to a data structure that holds them ordered by expiration time, typically a heap (e.g. the JDK's ScheduledThreadPoolExecutor).

Tasks such as HttpClient requests with the same timeout have very similar expiration times, possibly forcing a deep walk of the heap data structure.

We need a more scalable scheduler.

@sbordet
Contributor Author

sbordet commented Oct 23, 2017

One solution would be to use multiple schedulers and then round-robin the task submissions.
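The round-robin idea could be sketched roughly like this (a hypothetical `RoundRobinScheduler` wrapper, not actual Jetty code): each submission goes to the next of N independent schedulers, so each internal heap holds about 1/N of the tasks and contention is divided accordingly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: spread scheduling load over N independent
// schedulers, picking one per submission in round-robin order.
class RoundRobinScheduler
{
    private final ScheduledExecutorService[] schedulers;
    private final AtomicLong index = new AtomicLong();

    RoundRobinScheduler(int size)
    {
        schedulers = new ScheduledExecutorService[size];
        for (int i = 0; i < size; i++)
            schedulers[i] = Executors.newSingleThreadScheduledExecutor();
    }

    ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit)
    {
        // Each submission goes to the next scheduler, so each internal
        // heap sees only ~1/N of the insertions and removals.
        int i = (int)(index.getAndIncrement() % schedulers.length);
        return schedulers[i].schedule(task, delay, unit);
    }
}
```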

@sbordet
Contributor Author

sbordet commented Oct 23, 2017

Another option would be to use a hashed wheel as described in this paper.
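A hashed wheel in its simplest form might look like the sketch below (hypothetical `TimingWheel`, not actual Jetty code, and simplified: it ignores "rounds", so it only handles delays shorter than one full rotation of the wheel). Tasks are hashed by expiry tick into buckets, making insertion and cancellation O(1) instead of the O(log n) heap operations of ScheduledThreadPoolExecutor.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a hashed timing wheel: tasks are hashed into
// buckets by expiry tick, so insertion is O(1) instead of the
// O(log n) heap insertion of ScheduledThreadPoolExecutor.
class TimingWheel
{
    private final Queue<Runnable>[] buckets;
    private final long tickMillis;
    private long currentTick;

    @SuppressWarnings("unchecked")
    TimingWheel(int wheelSize, long tickMillis)
    {
        this.buckets = new Queue[wheelSize];
        for (int i = 0; i < wheelSize; i++)
            buckets[i] = new ConcurrentLinkedQueue<>();
        this.tickMillis = tickMillis;
    }

    void schedule(Runnable task, long delayMillis)
    {
        // O(1): hash the expiry tick into a bucket.
        long ticks = Math.max(1, delayMillis / tickMillis);
        int bucket = (int)((currentTick + ticks) % buckets.length);
        buckets[bucket].add(task);
    }

    // Called by a timer thread every tickMillis: run the expired bucket.
    void advance()
    {
        int bucket = (int)(++currentTick % buckets.length);
        Runnable task;
        while ((task = buckets[bucket].poll()) != null)
            task.run();
    }
}
```

The trade-off is timeout granularity: tasks only expire on tick boundaries, which is usually acceptable for coarse timeouts like idle or request timeouts.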

@gregw
Contributor

gregw commented Oct 24, 2017

I'd prefer to look for ways to avoid load on the scheduler before considering re-implementing what should be JVM-provided services with better algorithms.

Perhaps the client can better apply techniques like our own IdleTimeout mechanism, which is based on the assumption that most timeouts are for similar expiry times which never actually expire, but instead are cancelled and/or reset to a future time.

The IdleTimeout approach allows a connection to make thousands (maybe 10s or 100s of thousands) of timed operations between any actual access to the shared scheduler mechanism.

This mechanism should already be used by the core HttpClient connection for idle timeouts. Perhaps it should also be used for the connection pool, which currently appears to set and cancel timeouts on release/remove events, which may occur many thousands of times per second for a given connector.
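The IdleTimeout idea described above can be sketched roughly as follows (a hypothetical, simplified `LazyIdleTimeout`, not the actual Jetty IdleTimeout code): each timed operation just stores a volatile timestamp, and the shared scheduler is only touched when a previously scheduled check fires and finds time still remaining, at which point it reschedules for the remainder.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the IdleTimeout idea: each timed operation just
// stores a volatile timestamp; the shared scheduler is touched only when
// a previously scheduled check fires.
abstract class LazyIdleTimeout
{
    private final ScheduledExecutorService scheduler;
    private final long idleTimeoutMillis;
    private volatile long lastActivity = System.nanoTime();

    LazyIdleTimeout(ScheduledExecutorService scheduler, long idleTimeoutMillis)
    {
        this.scheduler = scheduler;
        this.idleTimeoutMillis = idleTimeoutMillis;
        schedule(idleTimeoutMillis);
    }

    // Called for every timed operation; no scheduler access at all.
    void notifyActivity()
    {
        lastActivity = System.nanoTime();
    }

    private void schedule(long delayMillis)
    {
        scheduler.schedule(this::check, delayMillis, TimeUnit.MILLISECONDS);
    }

    private void check()
    {
        long idleMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - lastActivity);
        if (idleMillis >= idleTimeoutMillis)
            onIdleExpired();
        else
            schedule(idleTimeoutMillis - idleMillis); // reschedule for the remainder
    }

    protected abstract void onIdleExpired();
}
```

With a 30s timeout, a connection doing thousands of operations per second touches the scheduler only about once every 30 seconds, regardless of the operation rate.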

@gregw
Contributor

gregw commented Oct 26, 2017

@sbordet I've taken the core algorithm from IdleTimeout and created a new CyclicTimeoutTask from it that has a more familiar schedule/cancel style interface.

The class uses an AtomicLong to hold an expiry time, so that schedule, reschedule and cancel operations are typically just a compareAndSet operation.

The actual timeout handling is done by a loosely coherent chain of Scheduler.Tasks rooted in an AtomicReference<>. The first time a timeout is scheduled, the AtomicLong is set with the expiry time and a Scheduler.Task (wrapped in a Schedule object) is obtained from a call to Scheduler.schedule.

The normal operation is that the CyclicTimeoutTask will be cancelled, in which case just the AtomicLong expiry time is updated and the scheduled task is left as-is. As timeouts are often 30s, but the timer may be used 100s or 1000s of times per second, it is expected that many schedule/cancel cycles will proceed with just a compare/set on the AtomicLong expiry.

Eventually the scheduled task will expire, in which case it will check the AtomicLong expiry. In most cases the expiry will have been scheduled or rescheduled very recently and the task should not expire, so a new scheduled task is created for the time remaining until the next expiry. If the timeout was originally 30s and there is a schedule/cancel every second, then the timeout task will next be scheduled for 29s in the future.

If the expiry time is actually exceeded, then a compareAndSet operation is used to atomically "take" the expiry and ensure the callback is called only once.

If a cancel/schedule or a reschedule call arrives with a shorter time period, then a new shorter scheduled task may be created and chained with any existing one. This chain is only roughly coherent, as the only danger of multiple access is that we might set an extra scheduled task and thus do an extra check for expiry.
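The fast path described above could be sketched like this (a hypothetical, simplified `ExpiryHolder`, not the actual CyclicTimeoutTask code; it shows only the AtomicLong expiry handling, not the chain of Scheduler.Tasks): schedule and cancel just write the AtomicLong, and the expiry callback atomically "takes" the expiry so it runs at most once.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the cancel/reschedule fast path: the expiry is
// a single AtomicLong, so the common schedule/cancel cycle never touches
// the shared scheduler.
class ExpiryHolder
{
    private static final long NOT_SCHEDULED = Long.MAX_VALUE;
    private final AtomicLong expiryNanos = new AtomicLong(NOT_SCHEDULED);

    // Fast path: record a new expiry time; a real implementation would
    // only create a Scheduler.Task if none covers this expiry already.
    void schedule(long nowNanos, long delayNanos)
    {
        expiryNanos.set(nowNanos + delayNanos);
    }

    // Fast path: cancelling just clears the expiry time; the underlying
    // scheduled task, if any, is left to fire harmlessly.
    void cancel()
    {
        expiryNanos.set(NOT_SCHEDULED);
    }

    // Called when the underlying scheduled task fires: atomically "take"
    // the expiry so the expiration callback runs at most once.
    boolean tryExpire(long nowNanos)
    {
        while (true)
        {
            long expiry = expiryNanos.get();
            if (expiry == NOT_SCHEDULED || nowNanos < expiry)
                return false; // cancelled, or rescheduled into the future
            if (expiryNanos.compareAndSet(expiry, NOT_SCHEDULED))
                return true;
        }
    }
}
```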

gregw added a commit that referenced this issue Nov 2, 2017
Added CyclicTimeoutTask class, which uses the same algorithm as the existing IdleTimeout
class, but reimplemented with a more familiar Scheduler-style interface.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@gregw
Contributor

gregw commented Nov 25, 2017

@sbordet you were right. There is a problem with the dual Atomics! It's not that we lose a Schedule, but that if one falls off the linked list we can start creating additional Schedules, which causes other Schedules to fall off the linked list, etc., and soon we have a cascade of unnecessary Schedules that can grow. I saw this as soon as I simplified the mechanism by removing the chain. I think having the chain reduces the possibility of such a cascade, but I can't yet convince myself that it is a 0% probability. So I will probably convert to a lock over 2 fields.
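The lock-over-two-fields alternative mentioned above might look roughly like this (a hypothetical `LockedCyclicTimeout`, not the final Jetty code): a single lock guards both the expiry time and the one scheduled task, so a stale task firing can never race a reschedule into creating extra tasks.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the lock-based alternative: one lock guards the
// expiry time and the single scheduled task, eliminating the dual-atomic
// race at the cost of brief lock acquisitions.
class LockedCyclicTimeout
{
    private final ScheduledExecutorService scheduler;
    private final Runnable onExpired;
    private long expiryNanos = Long.MAX_VALUE;
    private ScheduledFuture<?> task;

    LockedCyclicTimeout(ScheduledExecutorService scheduler, Runnable onExpired)
    {
        this.scheduler = scheduler;
        this.onExpired = onExpired;
    }

    synchronized void schedule(long delayNanos)
    {
        long newExpiry = System.nanoTime() + delayNanos;
        // Only touch the scheduler when no task covers the new expiry.
        if (task == null || newExpiry < expiryNanos)
        {
            if (task != null)
                task.cancel(false);
            task = scheduler.schedule(this::check, delayNanos, TimeUnit.NANOSECONDS);
        }
        expiryNanos = newExpiry;
    }

    synchronized void cancel()
    {
        expiryNanos = Long.MAX_VALUE;
    }

    private void check()
    {
        boolean expired = false;
        synchronized (this)
        {
            if (expiryNanos == Long.MAX_VALUE)
            {
                task = null; // cancelled
            }
            else
            {
                long remaining = expiryNanos - System.nanoTime();
                if (remaining <= 0)
                {
                    task = null;
                    expiryNanos = Long.MAX_VALUE;
                    expired = true;
                }
                else
                {
                    // Rescheduled while we were pending: cover the remainder.
                    task = scheduler.schedule(this::check, remaining, TimeUnit.NANOSECONDS);
                }
            }
        }
        if (expired)
            onExpired.run(); // run the callback outside the lock
    }
}
```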

sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
sbordet pushed a commit that referenced this issue Dec 19, 2017
Added CyclicTimeoutTask class, which uses the same algorithm as the existing IdleTimeout
class, but reimplemented with a more familiar Scheduler-style interface.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
sbordet pushed a commit that referenced this issue Dec 19, 2017
Used the CyclicTimeoutTask class in the clients HttpDestination and HttpConnection
as an alternative to direct scheduler usage.
Avoid race between compare and set by allowing reschedule to always set the timer

Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Dec 20, 2017
Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet pushed a commit that referenced this issue Jan 4, 2018
Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Jan 8, 2018
Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Jan 10, 2018
* Scalable scheduler changes for #1918
* Added HttpChannel.destroy to destroy CyclicTimer
* fixed rebase with HttpConnectionOverFCGI
* renamed to acquire
* Destroying the HttpChannel consistently in all transports.
* updated headers
* cleanup after final review

Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
@gregw gregw closed this as completed Jan 10, 2018
sbordet added a commit that referenced this issue Jan 10, 2018
Fixed destroy of HttpChannel for HTTP/1.1: not at release()
because the connection and therefore the channel will be reused,
but at close(), when we're sure the connection will not be reused.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>