
Barbara compares some cpp code (and has a performance problem) #144

Merged · 6 commits · Apr 18, 2021

Conversation

@guswynn (Contributor) commented Apr 12, 2021

The technical conclusion of the specific example in this doc is similar to #132, but the story in general can be about how to compare Rust async code with other languages' async code (particularly C++20).
@guswynn changed the title from "Create barbara_compares_some_cpp_code.md" to "Barbara compares some cpp code (and has a performance problem)" on Apr 12, 2021
    impl Client {
        async fn request(&self) -> Result<Data> {
            let bytes = self.inner.network_request().await?;
            Ok(serialization_libary::from_bytes(&bytes)?)
        }
    }

@farnz (Contributor) commented Apr 12, 2021

This code is still problematic, even with the spawn below, because you've got blocking work on a non-blocking task. Ideally, it would use spawn_blocking to fix this:

        Ok(tokio::task::spawn_blocking(move || serialization_libary::from_bytes(&bytes)).await??)

or, if the library is normally fast but slow on some (big?) inputs, you might prefer block_in_place (which ensures that other tasks keep running), combined with the spawn solution you gave:

        Ok(tokio::task::block_in_place(|| serialization_libary::from_bytes(&bytes))?)

Goes to show how hard this user story is to get right - the fix in this story still doesn't work well.

There's a fun sequence here where you add block_in_place and it's still bad, and then add the spawn, which fixes it.

One that's bitten us in Mononoke (issue #131 here) is that even if the long-running work is also nicely async (i.e. it just makes another network request), it can still go wrong.
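
A fuller sketch of the spawn_blocking variant suggested above (hypothetical; it reuses the story's Client/Data/serialization_libary names, adds move so the closure satisfies spawn_blocking's 'static bound, and assumes the error type can absorb a Tokio JoinError):

```rust
impl Client {
    async fn request(&self) -> Result<Data> {
        let bytes = self.inner.network_request().await?;
        // Run the CPU-heavy deserialization on Tokio's blocking thread pool
        // so it cannot stall this async worker thread.
        let data = tokio::task::spawn_blocking(move || {
            serialization_libary::from_bytes(&bytes)
        })
        .await??; // first ? handles the JoinError, second ? the parse error
        Ok(data)
    }
}
```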

@guswynn (Contributor, Author) commented Apr 12, 2021

TIL about block_in_place, though I find it hard to reason about. I think in this case only spawn_blocking would work, but it has its own problems (it can't be cancelled). (Edit: block_in_place plus my spawn below works as well, as you mentioned, but in practice not having the block_in_place hasn't proven problematic for me; perhaps I have even more perf improvements lurking.) And, like you mentioned, the CPU work is only expensive if the result is large, which is a runtime property; I'm not sure how we could detect that and do the correct thing. async-rs/async-std#631 looks like an attempt, but I need to read more into why it was closed.
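
A minimal sketch of the block_in_place-plus-spawn combination mentioned here (hypothetical; it reuses the story's types and requires Tokio's multi-thread runtime, since block_in_place panics on a current-thread runtime):

```rust
impl Client {
    async fn request(&self) -> Result<Data> {
        let bytes = self.inner.network_request().await?;
        // Spawn the deserialization as its own task, and mark the CPU-heavy
        // section with block_in_place so the worker's queued tasks can be
        // handed to another thread while it runs.
        let data = tokio::spawn(async move {
            tokio::task::block_in_place(|| serialization_libary::from_bytes(&bytes))
        })
        .await??;
        Ok(data)
    }
}
```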

Contributor Author commented:

What does block_in_place do? Isn't each tokio task already resilient to starving the others because they run on independent threads? Or is there some way the work-stealing part of it locks up?

Contributor commented:

Each worker thread has its own (fixed-size) queue of tasks to service. There's a global queue for overflows, too, and a certain amount of rebalancing done by each worker thread to avoid starving the tasks in the global queue.

If a worker thread goes idle, and the global queue is empty, then it will steal work from other workers. block_in_place ensures that the blocked worker's local queue is emptied before the blocking work runs. As a result, it all depends on the balance between thread count (typically one per CPU hart) and runnable task count; if you have (e.g.) a huge AMD EPYC server with 256 harts (two threads per core, 128 cores in the system), and typically only 100 tasks to run, then you'll never see the difference between spawn_blocking/block_in_place and spawn, because there will be idle workers that steal work from blocked workers.

On the other hand, if you have a small system (say an older laptop) with 4 harts (two cores, two threads per core), and you still have 100 tasks to run, without block_in_place you might see the runtime fail to migrate some of those tasks off the blocked worker and onto a running worker - the result is starvation for the tasks that don't get migrated to another worker thread.

I'm guessing that you've never seen the starvation issue because you normally have lots of worker threads, and not many runnable tasks at a time - as a result, a worker goes idle and steals work from the blocked thread. If you had small numbers of worker threads, but still lots of tasks, you'd see starvation more often.
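
A toy reproduction (not from this thread) of the small-runtime starvation described above; it assumes Tokio with the rt-multi-thread and macros features, and on most runs the small task is delayed by roughly the full two seconds:

```rust
use std::time::{Duration, Instant};

#[tokio::main(flavor = "multi_thread", worker_threads = 2)]
async fn main() {
    let start = Instant::now();

    // Enough accidentally-blocking tasks to occupy every worker thread.
    for _ in 0..2 {
        tokio::spawn(async {
            // Synchronous work inside an async task: this worker cannot poll
            // anything else until it returns. Wrapping it in
            // tokio::task::block_in_place(|| ...) would let the runtime hand
            // the worker's queued tasks to another thread first.
            std::thread::sleep(Duration::from_secs(2));
        });
    }

    // This task is ready immediately, but with both workers blocked it is
    // not polled until one of them frees up.
    let delay = tokio::spawn(async move { start.elapsed() }).await.unwrap();
    println!("small task ran after {:?}", delay);
}
```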

Contributor commented:

This sounds like it might make a nice addition to the story. e.g., Barbara is happy for a while, but then she sees starvation or something like that. That said, it could also be a good FAQ. I wouldn't change the main part of the story, though, since invoking tokio::spawn seems to be what "Barbara" actually did here.

Contributor commented:

Re-reading this comment I was definitely struck by the challenge of "how to jigger this exactly right". I'm curious, @farnz, whether you think that a more adaptive runtime -- similar to what Go does, or what was proposed in this blog post, but never adopted -- would be a better fit for Mononoke?

One of the challenges here seems to be a matter of composition. The "right place" to insert block_in_place, for example, is inside of reqwest, at least if I understand correctly, but it may not be well positioned to judge whether messages are likely to be large or not.

Contributor commented:

Oh I guess I was wrong about that last bit. Still, it seems like these kinds of issues could frequently cross library boundaries.

Contributor commented:

Oh also I see you referenced async-rs/async-std#631 in the FAQ :)

Contributor commented:

I'm not convinced by more adaptive runtimes - they tend to result in developers not realising when they've introduced accidental blocking, and I have a bias towards engineers knowing the cost of their decisions.

There are two orthogonal problems that I perceive here:

  1. Two different ways to flag a section as blocking - block_in_place and spawn_blocking - with room for semantic confusion. In my ideal world, there would only be one - block_in_place - and something suitable done in the back-end so that it interacts nicely with spawn, such that spawn(async { block_in_place(|| thing()) }) is exactly the same as (or better than) spawn_blocking(|| thing()) today. That way, the guidance would be block_in_place for things that might block the CPU, and spawn to get parallelism.
  2. Lots of different ways to end up accidentally starving a task that's ready to run - both long polls, and failure to poll a sub-scheduler in a timely fashion. If I could wave a wand, I'd have async task tracing that tells me which tasks are taking too long to poll, and what they're doing when they hog the worker thread; this would also catch the issue where a sub-scheduler doesn't poll, because I'd see the tracing showing the first poll in the sub-scheduler, and then a long gap.

For point 2 there, tracing-futures provides the Instrumented wrapper which allows me to build a chunk of the tracing I'd want. Having the runtime automatically instrument everything so that I can see exactly when a future is polled (or not polled) and how long each poll takes would be better for my needs than an adaptive runtime; I could use tracing-flame to both identify blocking (long polls), and starvation (long gaps between a future being ready to wake and being polled), and fix up both.
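
A minimal sketch of that kind of per-future instrumentation, assuming the tracing crate's Instrument extension trait (closely related to tracing-futures' Instrumented wrapper) and reusing the story's hypothetical Client type; each poll of the wrapped future enters and exits the span, so long polls show up directly in a tool like tracing-flame:

```rust
use tracing::Instrument;

async fn handle(client: &Client) -> Result<Data> {
    // Every poll of `request()` is recorded inside this span, so a poll that
    // hogs the worker thread is visible as one long span entry.
    client
        .request()
        .instrument(tracing::info_span!("client_request"))
        .await
}
```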

The missing bit in tracing-futures is identifying when a Future is ready to be polled again - I'd love to see some sort of event tied to the future's span, so that I can see that (e.g.) it was ready to be polled again, but not polled for 10 seconds, or that it got polled again 2 µs after the waker was notified. I'd want to skip the first poll for this (the gap between creation and first poll) because that's a delay that I believe most people would predict in code review.
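
That wake-to-poll gap is not something tracing-futures records today, but here is a rough, hypothetical sketch of how one could measure it by hand with a wrapper future and std's Wake trait (all names invented for illustration):

```rust
use std::{
    future::Future,
    pin::Pin,
    sync::{Arc, Mutex},
    task::{Context, Poll, Wake, Waker},
    time::Instant,
};

/// Wraps a future and reports how long it sat between being woken and
/// actually being polled again.
struct PollGap<F> {
    inner: Pin<Box<F>>,
    last_wake: Arc<Mutex<Option<Instant>>>,
}

impl<F: Future> PollGap<F> {
    fn new(inner: F) -> Self {
        PollGap {
            inner: Box::pin(inner),
            last_wake: Arc::new(Mutex::new(None)),
        }
    }
}

/// Waker that records the instant the wrapped future was last woken.
struct GapWaker {
    inner: Waker,
    last_wake: Arc<Mutex<Option<Instant>>>,
}

impl Wake for GapWaker {
    fn wake(self: Arc<Self>) {
        *self.last_wake.lock().unwrap() = Some(Instant::now());
        self.inner.wake_by_ref();
    }
}

impl<F: Future> Future for PollGap<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        let this = self.get_mut();
        // If we were woken earlier, report how long the runtime took to get
        // back to us; this is the gap described above.
        if let Some(woken_at) = this.last_wake.lock().unwrap().take() {
            eprintln!("polled {:?} after being woken", woken_at.elapsed());
        }
        let waker = Waker::from(Arc::new(GapWaker {
            inner: cx.waker().clone(),
            last_wake: this.last_wake.clone(),
        }));
        let mut cx = Context::from_waker(&waker);
        this.inner.as_mut().poll(&mut cx)
    }
}
```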

Contributor commented:

Thanks @farnz. Helpful.

@nikomatsakis (Contributor) left a comment:

This is great, thanks. I think it'd be good to add something similar to what @farnz suggested, or perhaps @farnz might have a more concrete proposal.

@farnz (Contributor) left a comment:

These are just suggestions - I'm bringing in two other archetypes because my experience of dealing with these problems with my peers is that you end up (a) making the same mistake Barbara did at first, and (b) finding it hard to explain your decisions to your team.

My goals are twofold: first, to emphasise that Barbara is doing the right thing when she hits the problem - it's not something that "should" have been obvious as she was writing the call to serialization_libary::from_bytes - and secondly to make it clear that the obvious fix of spawn is incomplete, and that another fix is also needed.

@nikomatsakis (Contributor) left a comment:

This looks about ready to land to me. I left some small requests.


guswynn and others added 3 commits on April 14, 2021 (co-authored by Alice Ryhl <alice@ryhl.io> and Ryan Levick <rylev@users.noreply.github.com>).
@guswynn (Contributor, Author) commented Apr 15, 2021:

@nikomatsakis I think this covers all the comments?

@nikomatsakis (Contributor) left a comment:

I'm ready to merge, yes! This is great, thanks @guswynn. I just re-read and I feel like I got a deeper understanding.

Labels: status-quo-story-ideas ("Status quo" user story ideas)
Participants: 5