
Scalable scheduler implementation #1918

Closed
sbordet opened this issue Oct 23, 2017 · 5 comments

Comments

@sbordet
Contributor

sbordet commented Oct 23, 2017

Under high request rate, the scheduler may be subject to high contention.

Scheduled tasks are added to a data structure that holds them ordered by expiration time, typically a heap (e.g. the JDK's ScheduledThreadPoolExecutor).

Tasks such as HttpClient requests with the same timeout have very similar expiration times, possibly forcing a deep walk of the heap data structure.

We need a more scalable scheduler.

@sbordet
Contributor Author

sbordet commented Oct 23, 2017

One solution would be to use multiple schedulers and then round-robin the task submissions.
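The round-robin idea could be sketched roughly like this (a hypothetical `RoundRobinScheduler` wrapper, not actual Jetty code): each submission goes to the next of N independent schedulers, so each internal heap holds about 1/N of the tasks and contention is divided accordingly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: spread scheduling load over N independent
// schedulers, picking one per submission in round-robin order.
class RoundRobinScheduler
{
    private final ScheduledExecutorService[] schedulers;
    private final AtomicLong index = new AtomicLong();

    RoundRobinScheduler(int size)
    {
        schedulers = new ScheduledExecutorService[size];
        for (int i = 0; i < size; i++)
            schedulers[i] = Executors.newSingleThreadScheduledExecutor();
    }

    ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit)
    {
        // Each submission goes to the next scheduler, so each internal
        // heap sees only ~1/N of the insertions and removals.
        int i = (int)(index.getAndIncrement() % schedulers.length);
        return schedulers[i].schedule(task, delay, unit);
    }
}
```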

@sbordet
Contributor Author

sbordet commented Oct 23, 2017

Another option would be to use a hashed wheel as described in this paper.
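A hashed wheel in its simplest form might look like the sketch below (hypothetical `TimingWheel`, not actual Jetty code, and simplified: it ignores "rounds", so it only handles delays shorter than one full rotation of the wheel). Tasks are hashed by expiry tick into buckets, making insertion and cancellation O(1) instead of the O(log n) heap operations of ScheduledThreadPoolExecutor.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a hashed timing wheel: tasks are hashed into
// buckets by expiry tick, so insertion is O(1) instead of the
// O(log n) heap insertion of ScheduledThreadPoolExecutor.
class TimingWheel
{
    private final Queue<Runnable>[] buckets;
    private final long tickMillis;
    private long currentTick;

    @SuppressWarnings("unchecked")
    TimingWheel(int wheelSize, long tickMillis)
    {
        this.buckets = new Queue[wheelSize];
        for (int i = 0; i < wheelSize; i++)
            buckets[i] = new ConcurrentLinkedQueue<>();
        this.tickMillis = tickMillis;
    }

    void schedule(Runnable task, long delayMillis)
    {
        // O(1): hash the expiry tick into a bucket.
        long ticks = Math.max(1, delayMillis / tickMillis);
        int bucket = (int)((currentTick + ticks) % buckets.length);
        buckets[bucket].add(task);
    }

    // Called by a timer thread every tickMillis: run the expired bucket.
    void advance()
    {
        int bucket = (int)(++currentTick % buckets.length);
        Runnable task;
        while ((task = buckets[bucket].poll()) != null)
            task.run();
    }
}
```

The trade-off is timeout granularity: tasks only expire on tick boundaries, which is usually acceptable for coarse timeouts like idle or request timeouts.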

@gregw
Contributor

gregw commented Oct 24, 2017

I'd prefer to look for ways to avoid load on the scheduler before considering re-implementing what should be JVM-provided services with better algorithms.

Perhaps the client can better apply techniques like our own IdleTimeout mechanism, which is based on the assumption that most timeouts are for similar expiry times which never actually expire, but instead are cancelled and/or reset to a future time.

The IdleTimeout approach allows a connection to make thousands (maybe 10s or 100s of thousands) of timed operations between any actual access to the shared scheduler mechanism.

This mechanism should already be used by the core HttpClient connection for idle timeouts. Perhaps it should also be used for the connection pool, which currently appears to set and cancel timeouts on release/remove events, which may occur many thousands of times per second for a given connector.
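The IdleTimeout idea described above can be sketched roughly as follows (a hypothetical, simplified `LazyIdleTimeout`, not the actual Jetty IdleTimeout code): each timed operation just stores a volatile timestamp, and the shared scheduler is only touched when a previously scheduled check fires and finds time still remaining, at which point it reschedules for the remainder.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the IdleTimeout idea: each timed operation just
// stores a volatile timestamp; the shared scheduler is touched only when
// a previously scheduled check fires.
abstract class LazyIdleTimeout
{
    private final ScheduledExecutorService scheduler;
    private final long idleTimeoutMillis;
    private volatile long lastActivity = System.nanoTime();

    LazyIdleTimeout(ScheduledExecutorService scheduler, long idleTimeoutMillis)
    {
        this.scheduler = scheduler;
        this.idleTimeoutMillis = idleTimeoutMillis;
        schedule(idleTimeoutMillis);
    }

    // Called for every timed operation; no scheduler access at all.
    void notifyActivity()
    {
        lastActivity = System.nanoTime();
    }

    private void schedule(long delayMillis)
    {
        scheduler.schedule(this::check, delayMillis, TimeUnit.MILLISECONDS);
    }

    private void check()
    {
        long idleMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - lastActivity);
        if (idleMillis >= idleTimeoutMillis)
            onIdleExpired();
        else
            schedule(idleTimeoutMillis - idleMillis); // reschedule for the remainder
    }

    protected abstract void onIdleExpired();
}
```

With a 30s timeout, a connection doing thousands of operations per second touches the scheduler only about once every 30 seconds, regardless of the operation rate.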

@gregw
Contributor

gregw commented Oct 26, 2017

@sbordet I've taken the core algorithm from IdleTimeout and created a new CyclicTimeoutTask from it that has a more familiar schedule/cancel style interface.

The class uses an AtomicLong to hold an expiry time, so that schedule, reschedule and cancel operations are typically just a compareAndSet operation.

The actual timeout handling is done by a loosely coherent chain of Scheduler.Tasks rooted in an AtomicReference<>. The first time a timeout is scheduled, the AtomicLong is set with the expiry time and a Scheduler.Task (wrapped in a Schedule object) is obtained from a call to Scheduler.schedule.

The normal operation is that the CyclicTimeoutTask will be cancelled, in which case just the AtomicLong expiry time is updated and the scheduled task is left as-is. As timeouts are often 30s, but the timer may be used 100s or 1000s of times per second, it is expected that many schedule/cancel cycles will proceed with just a compare/set on the AtomicLong expiry.

Eventually the scheduled task will expire, in which case it will check the AtomicLong expiry. In most cases the expiry will have been scheduled or rescheduled very recently and the task should not expire, so a new scheduled task is created for the time remaining until the next expiry. If the timeout was originally 30s and there is a schedule/cancel every second, then the timeout task will next be scheduled for 29s in the future.

If the expiry time is actually exceeded, then a compareAndSet operation is used to atomically "take" the expiry and ensure the callback is called only once.

If a cancel/schedule or a reschedule call arrives with a shorter time period, then a new shorter scheduled task may be created and chained with any existing one. This chain is only roughly coherent, as the only danger of multiple access is that we might set an extra scheduled task and thus do an extra check for expiry.
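The fast path described above could be sketched like this (a hypothetical, simplified `ExpiryHolder`, not the actual CyclicTimeoutTask code; it shows only the AtomicLong expiry handling, not the chain of Scheduler.Tasks): schedule and cancel just write the AtomicLong, and the expiry callback atomically "takes" the expiry so it runs at most once.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the cancel/reschedule fast path: the expiry is
// a single AtomicLong, so the common schedule/cancel cycle never touches
// the shared scheduler.
class ExpiryHolder
{
    private static final long NOT_SCHEDULED = Long.MAX_VALUE;
    private final AtomicLong expiryNanos = new AtomicLong(NOT_SCHEDULED);

    // Fast path: record a new expiry time; a real implementation would
    // only create a Scheduler.Task if none covers this expiry already.
    void schedule(long nowNanos, long delayNanos)
    {
        expiryNanos.set(nowNanos + delayNanos);
    }

    // Fast path: cancelling just clears the expiry time; the underlying
    // scheduled task, if any, is left to fire harmlessly.
    void cancel()
    {
        expiryNanos.set(NOT_SCHEDULED);
    }

    // Called when the underlying scheduled task fires: atomically "take"
    // the expiry so the expiration callback runs at most once.
    boolean tryExpire(long nowNanos)
    {
        while (true)
        {
            long expiry = expiryNanos.get();
            if (expiry == NOT_SCHEDULED || nowNanos < expiry)
                return false; // cancelled, or rescheduled into the future
            if (expiryNanos.compareAndSet(expiry, NOT_SCHEDULED))
                return true;
        }
    }
}
```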

gregw added a commit that referenced this issue Nov 2, 2017
Added CyclicTimeoutTask class, which uses the same algorithm as the existing IdleTimeout
class, but reimplemented with a more familiar Scheduler-style interface.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@gregw
Contributor

gregw commented Nov 25, 2017

@sbordet you were right. There is a problem with the dual Atomics! It's not that we lose a Schedule, but that if one falls off the linked list we can start creating additional Schedules, which causes other Schedules to fall off the linked list, etc., and soon we have a cascade of unnecessary Schedules that can grow. I saw this as soon as I simplified the mechanism by removing the chain. I think having the chain reduces the possibility of such a cascade, but I can't yet convince myself that it is a 0% probability. So I will probably convert to a lock over 2 fields.
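The lock-over-two-fields alternative mentioned above might look roughly like this (a hypothetical `LockedCyclicTimeout`, not the final Jetty code): a single lock guards both the expiry time and the one scheduled task, so a stale task firing can never race a reschedule into creating extra tasks.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the lock-based alternative: one lock guards the
// expiry time and the single scheduled task, eliminating the dual-atomic
// race at the cost of brief lock acquisitions.
class LockedCyclicTimeout
{
    private final ScheduledExecutorService scheduler;
    private final Runnable onExpired;
    private long expiryNanos = Long.MAX_VALUE;
    private ScheduledFuture<?> task;

    LockedCyclicTimeout(ScheduledExecutorService scheduler, Runnable onExpired)
    {
        this.scheduler = scheduler;
        this.onExpired = onExpired;
    }

    synchronized void schedule(long delayNanos)
    {
        long newExpiry = System.nanoTime() + delayNanos;
        // Only touch the scheduler when no task covers the new expiry.
        if (task == null || newExpiry < expiryNanos)
        {
            if (task != null)
                task.cancel(false);
            task = scheduler.schedule(this::check, delayNanos, TimeUnit.NANOSECONDS);
        }
        expiryNanos = newExpiry;
    }

    synchronized void cancel()
    {
        expiryNanos = Long.MAX_VALUE;
    }

    private void check()
    {
        boolean expired = false;
        synchronized (this)
        {
            if (expiryNanos == Long.MAX_VALUE)
            {
                task = null; // cancelled
            }
            else
            {
                long remaining = expiryNanos - System.nanoTime();
                if (remaining <= 0)
                {
                    task = null;
                    expiryNanos = Long.MAX_VALUE;
                    expired = true;
                }
                else
                {
                    // Rescheduled while we were pending: cover the remainder.
                    task = scheduler.schedule(this::check, remaining, TimeUnit.NANOSECONDS);
                }
            }
        }
        if (expired)
            onExpired.run(); // run the callback outside the lock
    }
}
```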

sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
sbordet pushed a commit that referenced this issue Dec 19, 2017
Added CyclicTimeoutTask class, which uses the same algorithm as the existing IdleTimeout
class, but reimplemented with a more familiar Scheduler-style interface.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
sbordet pushed a commit that referenced this issue Dec 19, 2017
Used the CyclicTimeoutTask class in the clients HttpDestination and HttpConnection
as an alternative to direct scheduler usage.
Avoid race between compare and set by allowing reschedule to always set the timer

Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet added a commit that referenced this issue Dec 19, 2017
Updated the implementation to timeout at the channel level, rather
than at the connection level, so that it will work for all transports.

Undeprecated IdleTimeout, as it still has a lot of usages.

Moved HttpClientTimeoutTest from jetty-client to test-http-client-transports.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Dec 20, 2017
Signed-off-by: Greg Wilkins <gregw@webtide.com>
sbordet pushed a commit that referenced this issue Jan 4, 2018
Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Jan 8, 2018
Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
gregw added a commit that referenced this issue Jan 10, 2018
* Scalable scheduler changes for #1918
* Added HttpChannel.destroy to destroy CyclicTimer
* fixed rebase with HttpConnectionOverFCGI
* renamed to acquire
* Destroying the HttpChannel consistently in all transports.
* updated headers
* cleanup after final review

Signed-off-by: Greg Wilkins <gregw@webtide.com>
Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
@gregw gregw closed this as completed Jan 10, 2018
sbordet added a commit that referenced this issue Jan 10, 2018
Fixed destroy of HttpChannel for HTTP/1.1: not at release()
because the connection and therefore the channel will be reused,
but at close(), when we're sure the connection will not be reused.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>