
Integrated timers #3219

Merged (34 commits) Jan 28, 2023

Conversation

djspiewak (Member)

Dusted off @vasilmkd's old branch (first PRed in #2252) and merged it with the latest head. Still needs some work:

  • The fallback sleep implementation in WorkStealingThreadPool references IORuntime.global just to get things to compile. This doesn't strike me as strictly necessary given the way that sleepInternal works, but trivially removing it got me into trouble.
  • The sleep cancelation action appears to just be a lazySet(false) on SleepCallback, which is concerning because it leaks memory. I believe this is why the fiber dump specs are failing.

The runtime has changed meaningfully since Vasil's original implementation, so this definitely needs a bit of careful thought to ensure that we aren't doing anything weird or conflicty. Notably, this implementation simply doesn't implement timer stealing at all, and instead all timers are held local to their scheduling thread. We're going to need to work to validate the hypothesis that we can get away with this. If we need to implement theft, things become a lot more complex.
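A rough sketch of what "timers held local to their scheduling thread" means in practice. Everything below is a hypothetical simplification for illustration (the `LocalTimers` class and its method names are made up; only the `triggerTime` idea echoes the PR's `SleepCallback`): each worker owns a private min-heap of timers and fires the expired ones between batches of work, so no cross-thread coordination is needed.

```scala
import scala.collection.mutable

// Hypothetical sketch: one of these per worker thread, touched only by
// its owner, so scheduling and firing timers needs no synchronization.
final class LocalTimers {
  private case class Timer(triggerTime: Long, action: Runnable)

  // mutable.PriorityQueue dequeues the maximum, so a reversed ordering
  // (like the PR's sleepCallbackReverseOrdering) yields the earliest timer.
  // The subtraction form keeps the comparison correct if nanoTime wraps.
  private val heap = mutable.PriorityQueue.empty[Timer](
    Ordering.fromLessThan[Timer]((a, b) => a.triggerTime - b.triggerTime > 0))

  def sleep(delayNanos: Long, action: Runnable): Unit =
    heap.enqueue(Timer(System.nanoTime() + delayNanos, action))

  // Called by the owning worker between units of work.
  def fireExpired(): Unit = {
    val now = System.nanoTime()
    while (heap.nonEmpty && heap.head.triggerTime - now <= 0)
      heap.dequeue().action.run()
  }
}
```

The downside of this shape is exactly the one debated later in the thread: if the owning worker never gets back around to draining its queue, nobody else will fire its timers.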

This is the first step in moving us toward a fully integrated runtime.

@djspiewak djspiewak added this to the v3.5.0 milestone Oct 30, 2022
@djspiewak djspiewak marked this pull request as draft October 30, 2022 14:36
armanbilge (Member)

  • The sleep cancelation action appears to just be a lazySet(false) on SleepCallback, which is concerning because it leaks memory. I believe this is why the fiber dump specs are failing.

Btw, I have an alternative SleepCallback implementation in the native implementation. Might be interesting if we can unify those.

private[this] final class SleepTask(
    val at: Long,
    val runnable: Runnable
) extends Runnable
    with Comparable[SleepTask] {

  def run(): Unit = {
    sleepQueue.remove(this)
    ()
  }

  def compareTo(that: SleepTask): Int =
    java.lang.Long.compare(this.at, that.at)
}
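Read on its own, the snippet assumes an enclosing `sleepQueue`; a self-contained sketch of how the pieces could fit (the `ConcurrentSkipListSet` stand-in and the `SleepQueueDemo` wrapper are assumptions, not the actual Scala Native code). The point of interest is the cancellation semantics: running the task removes it from the queue outright, so there is no `lazySet(false)`-style tombstone left behind to leak.

```scala
import java.util.concurrent.ConcurrentSkipListSet

// Hypothetical stand-in for the native scheduler's sleepQueue: a sorted
// set of pending sleeps, ordered by wake-up time.
object SleepQueueDemo {
  val sleepQueue = new ConcurrentSkipListSet[SleepTask]()

  final class SleepTask(val at: Long, val runnable: Runnable)
      extends Runnable
      with Comparable[SleepTask] {
    // Cancellation: remove the task from the queue, leaving nothing behind.
    def run(): Unit = {
      sleepQueue.remove(this)
      ()
    }
    def compareTo(that: SleepTask): Int =
      java.lang.Long.compare(this.at, that.at)
  }
}
```

One caveat with this sketch: `ConcurrentSkipListSet` treats `compareTo == 0` as equality, so two tasks with the same `at` would collide; the real implementation would need to break such ties.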

djspiewak (Member, Author)

Published as 3.5-639ac01

djspiewak (Member, Author)

Ran a quick test on Ember. With this change, peak RPS on a basic GET request improved by a little over 25%.

armanbilge (Member)

Um ... holy smokes!

djspiewak (Member, Author)

Did a little more testing. A slightly less trivial test involving a POST body and some JSON parsing (using Circe) shows peak RPS improvements around 14%, which makes intuitive sense since that test is going to be a little more bounded by the body processing than by the pure connection overhead. Still, 25% improvements in trivial GET peak RPS is pretty damn great.

P99 latencies in both cases were improved by 13.5%. I think this number is probably a bit more trustworthy, and also still very impressive IMO.

djspiewak (Member, Author)

Did a quick run through the WorkStealingBenchmark. It hasn't regressed even a small amount. I suspect this is because the JIT is simply inlining away the sleepers.nonEmpty check (since it's always false in those benchmarks). In a real case with real timers, I would expect some regression in straight-line performance, but obviously any such case would also benefit commensurately from this very change. Either way, I think it's safe to say this is ready for serious review.

We can do a lot better than the implementation as it stands, but that can be incrementally layered onto this once it lands.

@djspiewak djspiewak marked this pull request as ready for review November 21, 2022 02:16
djspiewak (Member, Author) commented Nov 21, 2022

TODO

  • Figure out why the CI keeps hanging (was able to reproduce locally but not minimize)
  • Make this work with blocking (and add tests)

@@ -514,6 +522,52 @@ private[effect] final class WorkStealingThreadPool(
*/
override def reportFailure(cause: Throwable): Unit = reportFailure0(cause)

override def monotonicNanos(): Long = System.nanoTime()

override def nowMillis(): Long = System.currentTimeMillis()
He-Pin:

Can this be pluggable? What if I want to make use of https://github.com/OpenHFT/Chronicle-Ticker?

Member:

@He-Pin yes, you can always replace the Scheduler in the IORuntime with your own implementation. See also the Cats Effect test runtime, which supports time mocking.
https://typelevel.org/cats-effect/docs/core/test-runtime#mocking-time
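For a concrete flavor of what "bring your own Scheduler" could look like, here is a sketch with a pluggable clock. The `SchedulerLike` trait below only mirrors the rough shape of `cats.effect.unsafe.Scheduler` (`sleep`/`nowMillis`/`monotonicNanos`); treat the exact signatures as an assumption, and `CustomClockScheduler` is entirely hypothetical:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.duration.FiniteDuration

// Assumed shape, loosely mirroring cats.effect.unsafe.Scheduler.
trait SchedulerLike {
  def sleep(delay: FiniteDuration, task: Runnable): Runnable // returns a cancel action
  def nowMillis(): Long
  def monotonicNanos(): Long
}

// Hypothetical scheduler whose clocks are injected, e.g. backed by
// something like Chronicle-Ticker instead of the System clocks.
final class CustomClockScheduler(
    wallClockMillis: () => Long,
    monotonicClockNanos: () => Long
) extends SchedulerLike {
  private val executor = Executors.newSingleThreadScheduledExecutor { (r: Runnable) =>
    val t = new Thread(r, "custom-scheduler")
    t.setDaemon(true)
    t
  }

  def sleep(delay: FiniteDuration, task: Runnable): Runnable = {
    val fut = executor.schedule(task, delay.toNanos, TimeUnit.NANOSECONDS)
    () => { fut.cancel(false); () }
  }

  def nowMillis(): Long = wallClockMillis()
  def monotonicNanos(): Long = monotonicClockNanos()
}
```

An instance of this would then be handed to the `IORuntime` in place of the default scheduler.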

djspiewak (Member, Author)

Released as snapshot 3.5-01f5b3a

djspiewak (Member, Author)

A few simple canaries were promising. At least, nothing exploded. We should look more closely. In the meantime, I think this is ready for more serious review. Note that there are definitely follow-ups we can explore which will absolutely micro-optimize this further. We can do that once this lands in series/3.x, since this is, by itself, already a significant improvement.

@djspiewak djspiewak requested a review from vasilmkd November 26, 2022 23:29
@armanbilge armanbilge mentioned this pull request Dec 26, 2022
durban (Contributor) commented Dec 29, 2022

Notably, this implementation simply doesn't implement timer stealing at all, and instead all timers are held local to their scheduling thread.

I think one scenario affected by this is when (1) a compute thread is blocked/spinwaiting on a condition (without using IO.blocking), and (2) for that condition to become true, a timer must fire. In this case, if the timer is coincidentally on the same thread as (1), the result is a deadlock. Of course, the solution is "don't do that". True, (1) should "never" happen, but I'm pretty sure it does. Together with (2)... I'm not sure how common it is...

In any case, this is a possible (if contrived) scenario, from which the current system can recover eventually (the timers are independent), but with this PR it can deadlock. (With timer stealing, I think it would also recover eventually, because the timer in question would get stolen.) I'm not sure how important supporting badly behaved code like this is, I just wanted to mention it.

djspiewak (Member, Author)

@durban So you're thinking about something like this?

val flag = new AtomicBoolean(false)
IO(flag.set(true)).delayBy(2.seconds) &> IO(while (!flag.get()) {})

I agree that, without timer stealing, the above can hang forever if the timer happens to be on the same thread as the while loop. However, just to be clear, if you replicateA_(100) the above, you will create a livelock even on the current version of Cats Effect. Locking up worker threads in this fashion is just… very bad. :-)

durban (Contributor) commented Dec 30, 2022

@djspiewak Yeah, something like that. The replicateA_ example is good, because it shows that currently it just seems to work. So yeah, "don't do that" is the solution.

}

implicit val sleepCallbackReverseOrdering: Ordering[SleepCallback] =
  Ordering.fromLessThan(_.triggerTime > _.triggerTime)

Review comment (Contributor), suggested change:
-  Ordering.fromLessThan(_.triggerTime > _.triggerTime)
+  Ordering.fromLessThan(_.triggerTime - _.triggerTime > 0)
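The motivation for the subtraction form: `System.nanoTime` values are only meaningful relative to each other and may wrap past `Long.MaxValue`, so a direct `>` can misorder two timestamps that are actually nanoseconds apart, while their difference still fits in a `Long`. A tiny demonstration:

```scala
// Two monotonic timestamps 10ns apart, straddling Long overflow.
val before = Long.MaxValue - 1L // just before the wrap
val after  = before + 10L       // 10ns later; overflows to a negative Long

// Direct comparison gets the order wrong across the wrap:
assert(!(after > before))

// The subtraction form stays correct, because the (small) difference
// is representable even though the operands wrapped:
assert(after - before > 0L)
```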

while (cont) {
  val head = sleepers.head()

  if (head.triggerTime <= now) {

Review comment (Contributor), suggested change:
-  if (head.triggerTime <= now) {
+  if (head.triggerTime - now <= 0) {

armanbilge (Member)

Btw, I had to make a small patch on my JVM polling branch in e44a802.

if (!isInterrupted()) {
  val now = System.nanoTime()
  val head = sleepersQueue.head()
  val nanos = head.triggerTime - now

Review comment (Member), cherry-pick e44a802, suggested change:
-  val nanos = head.triggerTime - now
+  val nanos = Math.max(head.triggerTime - now, 0)

if (scheduler.isInstanceOf[WorkStealingThreadPool])
  scheduler.asInstanceOf[WorkStealingThreadPool].sleepInternal(delay, cb)
else
  scheduler.sleep(delay, () => cb(RightUnit))

Review comment:

Use match?

djspiewak (Member, Author):

Actually match is often slower because of how it gets compiled. The JIT fast-paths the sequential combination of a conditional jump branching on isInstanceOf, followed immediately by a dynamic cast, and it turns that into a single operation on most architectures. match is more declarative but can easily generate bytecode which messes up this JIT optimization, so in very hot-path code we tend to be more explicit about it.
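For concreteness, the two shapes being compared, on a toy hierarchy (`Shape` and `Circle` are stand-ins for `Scheduler` and `WorkStealingThreadPool`, not Cats Effect code):

```scala
sealed trait Shape { def area: Double }
final class Circle(val r: Double) extends Shape {
  def area: Double = math.Pi * r * r
  def fastArea: Double = area // stand-in for a specialized internal method
}
final class Square(val side: Double) extends Shape {
  def area: Double = side * side
}

// Hot-path style: an explicit type test followed immediately by a cast,
// a sequence the JIT recognizes and fuses into a single check.
def areaExplicit(s: Shape): Double =
  if (s.isInstanceOf[Circle]) s.asInstanceOf[Circle].fastArea
  else s.area

// Declarative style: semantically identical, but the emitted bytecode
// may not have the exact test-then-cast shape the JIT fast-paths.
def areaMatch(s: Shape): Double = s match {
  case c: Circle => c.fastArea
  case other     => other.area
}
```

Both behave identically; the difference described above is purely about the generated bytecode on hot paths.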


import java.util.concurrent.atomic.AtomicBoolean

private final class SleepCallback private (
Review comment:

ScheduledCallback?

djspiewak (Member, Author):

ScheduledCallback might be a better name here.

5 participants