You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
After recent changes, Polars debug builds, which integrate tracemalloc tracking, started experiencing deadlocks. Ripping out the tracemalloc integration fixed them (pola-rs/polars#21231). This started happening specifically when GitHub Actions switched to 3.12.9. I imagine this is due to the recent tracemalloc bug fixes.
No reproducer yet, but as a starting point:
I am fairly certain this was not after doing tracemalloc.start(), since it happened in e.g. doctest runs that don't do that at all.
Here's a plausible theory: the way Polars works is there's a main thread that dispatches tasks to a Rust thread pool, and waits for results. When using Python version of Polars, it's important that main thread release the GIL before dispatching to the thread pool, for situations where operations running threads may acquire the GIL. So I went and did that in the relevant APIs. Many operations would never acquire the GIL so likely I missed some cases.
With latest version of tracemalloc, when tracemalloc is compiled into Polars, any Rust memory allocation will acquire the GIL. So the scope for deadlocks is now much broader. Any operation will acquire the GIL.
If this is the case, is it a bug in CPython? Maaaaybe? It's a change in behavior, certainly always acquiring GIL when allocations are tracked, even when tracemalloc.start() isn't called. But arguably if you're going to work with tracemalloc you need to handle GIL acquisition since that was always going to happen after tracemalloc.start() was called.
So maybe not a CPython bug. But probably worth at minimum documenting that PyTraceMalloc_Track() acquires the GIL.
I will try to validate this theory, and if it is correct I will close this issue.
My theory above was correct so going to try to fix this in Polars. In any case if someone else gets deadlocks in latest CPython patch versions, the above might be relevant.
Bug report
Bug description:
After recent changes, Polars debug builds, which integrate tracemalloc tracking, started experiencing deadlocks. Ripping out the tracemalloc integration fixed them (pola-rs/polars#21231). This started happening specifically when GitHub Actions switched to 3.12.9. I imagine this is due to the recent tracemalloc bug fixes.
No reproducer yet, but as a starting point:
tracemalloc.start()
, since it happened in e.g. doctest runs that don't do that at all.PyTraceMalloc_Track()
: https://github.com/pola-rs/polars/blob/f378b24aeaaa14a8e7dacb6b35103424e8b97f1c/py-polars/src/memory.rs#L53So this suggests some mixture of GIL-holding and non-GIL holding threads calling
PyTraceMalloc_Track()
and friends can cause deadlocks.CPython versions tested on:
3.12
Operating systems tested on:
Linux
The text was updated successfully, but these errors were encountered: