-
Notifications
You must be signed in to change notification settings - Fork 530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache now
in WorkerThread
in states 4-63
#3690
Cache now
in WorkerThread
in states 4-63
#3690
Conversation
case _ => | ||
// Call all of our expired timers: | ||
val now = System.nanoTime() | ||
var cont = true | ||
while (cont) { | ||
val cb = sleepers.pollFirstIfTriggered(now) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The optimization discussed in #3544 (comment) would become even more meaningful after this change, if now
may remain constant for several iterations. So crossing a read-barrier on every iteration is pointless 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I thought about that. It's not horribly complicated to do something like that here, it just introduces a bit more state and a bit more branching. I wanted to start small.
core/jvm/src/main/scala/cats/effect/unsafe/SchedulerCompanionPlatform.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a comment, otherwise looks reasonable. The hard part is (as you say) knowing what is the effect of this in real life...
I used a t2.large instance to measure three scenarios: series/3.4.x, series/3.5.x, and this PR (as of the current HEAD). I ran the series/3.4.xGranularity
Benchmarks
series/3.5.xGranularity
Benchmarks
|
I think there is something wrong with CI: some cirrus jobs didn't get scheduled(?) for more than 12 hours... |
Cirrus CI is using pre-emptible VMs to be cost-efficient. So under demand machines may not be available. |
This is one concern I have with expanding things to fs2, unless it's more of a global Cirrus demand and not github org-specific? |
It's this one I'm afraid. Edit: to clarify, besides global demand there are some other limits for OSS. But in FS2 I have only added four very short-running jobs, it is nowhere comparable to the CE matrix. |
Baseline (@armanbilge's 3.6 pre-release): https://www.techempower.com/benchmarks/#section=test&shareid=59f51434-e97e-4b76-91bb-78ba913a676e&test=plaintext tl;dr 15% more reqs/sec on plaintext, small improvement (1-2%) on JSON. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice stuff here 😃
I still want to see numbers which demonstrate the meaning of these types of things in realistic scenarios, but here's a quick draft that takes inspiration from libuv and minimizes the syscall. In particular, this caches the value of
nanoTime()
for states 4-63 (so, most of the time) and only refreshes it when either stealing, polling the external queue, or when the user evaluatesmonotonic
. This does trade off timer granularity a bit, but it's still strictly better than what we had pre-3.5 (where timers would reenter through the external queue).Fixes #3677