
Segfault with rayon & moka #495

Open
tatsuya6502 opened this issue Feb 4, 2025 · 3 comments
@tatsuya6502 (Member)

crossbeam-rs/crossbeam#1175

I've observed segfaults while using moka with rayon. Code to reproduce: https://github.com/polachok/moka-crossbeam-bug. I see this happening on both macOS/arm64 and linux/amd64.

The backtrace looks like this:

#8  0x0000555555599ca4 in crossbeam_epoch::internal::Local::flush (self=0x7fffe8001300, guard=0x7ffff7969fc0) at src/internal.rs:376
#9  crossbeam_epoch::guard::Guard::flush (self=0x7ffff7969fc0) at src/guard.rs:294
#10 0x000055555556a29d in moka::sync_base::base_cache::BaseCache<u64, u64, std::hash::random::RandomState>::do_post_update_steps<u64, u64, std::hash::random::RandomState> (
    self=<optimized out>, ts=..., key=..., old_info=..., upd_op=...) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/moka-0.12.10/src/sync_base/base_cache.rs:601

CC: @polachok

@tatsuya6502 (Member, Author) commented Feb 4, 2025

Hi. Thanks for reporting the issue. I tried your code briefly before going to work, but I'm afraid I could not reproduce it.

It crashes, but for a different reason from the one you reported:

$ cargo run --release
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
zsh: abort      cargo run --release

I gave names to the Rayon threads:

diff --git a/src/main.rs b/src/main.rs
index 704224a..9cd14fb 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -12,6 +12,7 @@ fn main() {
         .build();
     rayon::ThreadPoolBuilder::new()
         .num_threads(4)
+        .thread_name(|thread_id| format!("rayon-thread-{thread_id}"))
         .exit_handler(|thread_id| {
             println!("Thread '{}' exited", thread_id);
         })
Now the overflow message shows which Rayon thread hit it:

thread 'rayon-thread-3' has overflowed its stack
fatal runtime error: stack overflow

Then I tried increasing the stack size of the threads to 6 MB:

     rayon::ThreadPoolBuilder::new()
         .num_threads(4)
+        .thread_name(|thread_id| format!("rayon-thread-{thread_id}"))
+        .stack_size(6 * 1024 * 1024)
         .exit_handler(|thread_id| {
             println!("Thread '{}' exited", thread_id);
         })
thread 'rayon-thread-0' has overflowed its stack
fatal runtime error: stack overflow

I found that it runs fine when I increased the stack size to 7 MB:

     rayon::ThreadPoolBuilder::new()
         .num_threads(4)
+        .thread_name(|thread_id| format!("rayon-thread-{thread_id}"))
+        .stack_size(7 * 1024 * 1024)
         .exit_handler(|thread_id| {
             println!("Thread '{}' exited", thread_id);
         })
$ cargo run --release
...

$ echo $?
0

It seems that the spawned async tasks build up on the Rayon threads' stacks quickly(?).

Can you please check whether this works for you as well? You might also want to try increasing the stack size of the Rayon threads in your real program.
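
For reference, here is the Rayon pool configuration from the diffs above, consolidated into one sketch. The rest of the repro program is omitted, and the final .build() call is my assumption since the diffs cut off before it:

fn main() {
    // Consolidated from the diffs above. 7 MB avoided the overflow in my
    // runs on macOS/aarch64; the required size may differ per platform.
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(4)
        .thread_name(|thread_id| format!("rayon-thread-{thread_id}"))
        .stack_size(7 * 1024 * 1024)
        .exit_handler(|thread_id| {
            println!("Thread '{}' exited", thread_id);
        })
        // The terminal call is not shown in the diffs; .build() is an assumption.
        .build()
        .unwrap();

    // The repro's actual workload goes here.
    pool.install(|| {
        // ...
    });
}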


Environment:

  • Mac
    • macOS Sequoia 15.3
    • Apple M2 chip (4 high-performance cores, 4 high-efficiency cores)
  • Rust 1.84.0
    • The host and target: aarch64-apple-darwin

@tatsuya6502 (Member, Author)

I ran it on Linux x86_64 and got the same stack overflow. It was fixed after increasing the stack size to 6 MB.

I think the following Rayon issue explains the stack overflow errors.

From rayon-rs/rayon#854:

Rayon has implicit "recursion" due to work stealing. That is, whenever a rayon thread is blocked on the result from another rayon thread, it will look for other pending work to do in the meantime. That stolen work is executed directly from the same stack where it was blocked.

Your par_iter().for_each() becomes a bunch of nested joins, and each one of those may block if one half gets stolen to a new thread. Since stealing is somewhat random, the pool will have a mix of stolen joins and new spawns creating even more joins, and I can definitely see how that might get out of control. You're not doing anything wrong, but I'm not sure how to tame that.
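
To illustrate the shape described above (a hypothetical minimal sketch, not the repro code): a par_iter().for_each() whose items spawn further parallel work mixes stolen joins and new spawns on the same worker stacks.

use rayon::prelude::*;

fn main() {
    // Hypothetical shape, not the actual repro. for_each() is implemented as
    // nested joins; when one half of a join is stolen, the blocked worker
    // executes other pending work on the same stack, so frames accumulate.
    (0..1_000u64).into_par_iter().for_each(|i| {
        // Each item spawns yet more parallel work, adding to the mix of
        // stolen joins and new spawns described in the quote above.
        rayon::spawn(move || {
            (0..1_000u64).into_par_iter().for_each(|j| {
                std::hint::black_box(i.wrapping_mul(j));
            });
        });
    });
    // Note: spawned tasks are detached, so main may exit before they finish;
    // only the call shape matters for this illustration.
}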

@polachok commented Feb 8, 2025

Thanks, I think you're right and this can be closed.
I ran it under gdb, and it just shows Segmentation fault.
