Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make TSAN tests pass with the GIL disabled in free-threaded builds #117657

Open
2 of 23 tasks
Tracked by #108219
mpage opened this issue Apr 8, 2024 · 3 comments · Fixed by #123697
Open
2 of 23 tasks
Tracked by #108219

Make TSAN tests pass with the GIL disabled in free-threaded builds #117657

mpage opened this issue Apr 8, 2024 · 3 comments · Fixed by #123697
Assignees
Labels
topic-free-threading type-feature A feature request or enhancement

Comments

@mpage
Copy link
Contributor

mpage commented Apr 8, 2024

Feature or enhancement

We need to make the TSAN tests pass with the GIL disabled before we can disable the GIL by default in free-threaded builds.

This should proceed in two phases:

  1. Add suppressions for existing TSAN warnings.
  2. Triage, fix, and remove the suppressions for the warnings enumerated in (1).

How to run the TSAN tests

  1. Build with TSAN:
    env CC=clang CXX=clang++ ./configure --disable-gil --with-thread-sanitizer --with-pydebug && make -j
  2. Run tests:
    env TSAN_OPTIONS=suppressions=<repo_root>/Tools/tsan/suppressions_free_threading.txt ./python -m test --tsan -j4

Working on a race

  1. Verify that the TSAN tests are passing using the steps from above.
  2. Pick a suppression from the section below and assign it to yourself (edit it and add your username next to it). Some of them may be related. If you find that other suppressions are related to the race you're working on please assign them to yourself or contact their owner if they're already assigned.
  3. Delete the suppression from <repo_root>/Tools/tsan/suppressions_free_threading.txt , run the TSAN tests, and verify that the race is reported by TSAN. You may need to comment out unrelated functions (notably, _PyEval_EvalFrameDefault) in order to reproduce the race.
  4. Fix the race.

Suppressions

Tasks

Preview Give feedback

Linked PRs

@mpage mpage added type-feature A feature request or enhancement topic-free-threading labels Apr 8, 2024
@mpage mpage self-assigned this Apr 8, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Apr 12, 2024
colesbury pushed a commit that referenced this issue Apr 15, 2024
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
colesbury pushed a commit that referenced this issue Apr 15, 2024
…rld()` and `tstate_try_attach()` (#117828)

TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int`
on `tstate->state` in `tstate_try_attach()` and the non-atomic load of
`tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails,
but TSAN erroneously treats it as a store.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…ython#117736)

Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…the_world()` and `tstate_try_attach()` (python#117828)

TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int`
on `tstate->state` in `tstate_try_attach()` and the non-atomic load of
`tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails,
but TSAN erroneously treats it as a store.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
DinoV pushed a commit that referenced this issue Apr 17, 2024
…#117954)

Fix data races in the method cache in free-threaded builds

These are technically data races, but I think they're benign (to
the extent that that is actually possible). We update cache entries
non-atomically but read them atomically from another thread, and there's
nothing that establishes a happens-before relationship between the
reads and writes that I can see.
DinoV added a commit to mpage/cpython that referenced this issue Apr 17, 2024
DinoV pushed a commit that referenced this issue Apr 17, 2024
…#117955)

Quiet erroneous TSAN reports of data races in `_PySeqLock`

TSAN reports a couple of data races between the compare/exchange in
`_PySeqLock_LockWrite` and the non-atomic loads in `_PySeqLock_{Abandon,Unlock}Write`.
This is another instance of TSAN incorrectly modeling failed compare/exchange
as a write instead of a load.
DinoV added a commit that referenced this issue Apr 19, 2024
)

Use relaxed load to check if dictkeys are immortal
DinoV added a commit that referenced this issue Apr 19, 2024
…th strong enough semantics (#118111)

Use acquire for load of ob_ref_shared
DinoV pushed a commit that referenced this issue Apr 23, 2024
… `tstate->state` (#118165)

Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state`
colesbury added a commit to colesbury/cpython that referenced this issue Feb 6, 2025
These tests crash under TSan due to stack overflows. Just skip them if
TSan is enabled.
colesbury added a commit to colesbury/cpython that referenced this issue Feb 6, 2025
Skip `test_no_copies_if_tlbc_disabled` when run under TSAN for now
due to a data race on the adaptive counter (see pythongh-129752).
colesbury added a commit to colesbury/cpython that referenced this issue Feb 6, 2025
Found while running `test_load_attr_module` from `test_opcache` under TSan.
colesbury added a commit that referenced this issue Feb 6, 2025
)

These tests crash under TSan due to stack overflows. Just skip them if
TSan is enabled.
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this issue Feb 7, 2025
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this issue Feb 7, 2025
…python#129751)

These tests crash under TSan due to stack overflows. Just skip them if
TSan is enabled.
colesbury added a commit that referenced this issue Feb 7, 2025
Found while running `test_load_attr_module` from `test_opcache` under TSan.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Feb 7, 2025
Found while running `test_load_attr_module` from `test_opcache` under TSan.
(cherry picked from commit 34379d0)

Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury added a commit that referenced this issue Feb 7, 2025
…129808)

Found while running `test_load_attr_module` from `test_opcache` under TSan.
(cherry picked from commit 34379d0)

Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury added a commit to colesbury/cpython that referenced this issue Feb 7, 2025
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (pythongh-115999)
 * Add temporary suppression for type slot modifications (pythongh-127266)
 * Use atomic load when reading `*dictptr`
cmaloney pushed a commit to cmaloney/cpython that referenced this issue Feb 8, 2025
cmaloney pushed a commit to cmaloney/cpython that referenced this issue Feb 8, 2025
…python#129751)

These tests crash under TSan due to stack overflows. Just skip them if
TSan is enabled.
colesbury added a commit to colesbury/cpython that referenced this issue Feb 10, 2025
Skip `test_no_copies_if_tlbc_disabled` when run under TSAN for now
due to a data race on the adaptive counter (see pythongh-129752).
colesbury added a commit to colesbury/cpython that referenced this issue Feb 10, 2025
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (pythongh-115999)
 * Add temporary suppression for type slot modifications (pythongh-127266)
 * Use atomic load when reading `*dictptr`
colesbury added a commit that referenced this issue Feb 11, 2025
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (gh-115999)
 * Add temporary suppression for type slot modifications (gh-127266)
 * Use atomic load when reading `*dictptr`
colesbury added a commit that referenced this issue Feb 11, 2025
Skip `test_no_copies_if_tlbc_disabled` when run under TSAN for now
due to a data race on the adaptive counter (see gh-129752).
nascheme added a commit to nascheme/cpython that referenced this issue Feb 26, 2025
nascheme added a commit that referenced this issue Feb 26, 2025
The `PyType_HasFeature()` function reads the flags with a relaxed atomic
load and without holding the type lock.  To avoid data races, use atomic
stores if `PyType_Ready()` has already been called.
colesbury added a commit to colesbury/cpython that referenced this issue Feb 26, 2025
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (pythongh-115999)
 * Add temporary suppression for type slot modifications (pythongh-127266)
 * Use atomic load when reading `*dictptr`
(cherry picked from commit f151d27)

Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury added a commit that referenced this issue Feb 26, 2025
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (gh-115999)
 * Add temporary suppression for type slot modifications (gh-127266)
 * Use atomic load when reading `*dictptr`

(cherry picked from commit f151d27)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-free-threading type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants