Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Callbacks pended by Py_AddPendingCall will never be invoked if the main thread doesn't release GIL #95820

Open
JiayiFeng opened this issue Aug 9, 2022 · 5 comments
Assignees
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@JiayiFeng
Copy link

JiayiFeng commented Aug 9, 2022

According to the Py_AddPendingCall's document (https://docs.python.org/3/c-api/init.html#c.Py_AddPendingCall), a pended callback function should be invoked when:

It will be called asynchronously with respect to normally running Python code, but with both these conditions met:

- on a bytecode boundary;

- with the main thread holding the global interpreter lock.

So if the main thread is processing an endless loop like:

while True:
    pass

A callback pended from another thread by Py_AddPendingCall should still be able to be invoked.

It works in Python 3.8.13, but doesn't work in Python 3.9.13. In 3.9.13, the callback pended by Py_AddPendingCall will never be invoked in this case.

The root cause may be here: https://github.com/python/cpython/pull/19091/files#diff-c22186367cbe20233e843261998dc027ae5f1f8c0d2e778abfa454ae74cc59deL152-L153. After this change, pending a callback from a non-main thread will not set the ceval2->eval_breaker to be true. So when the main thread is trapped in an endless loop and keeps holding GIL, pending callbacks will never be invoked.

@JiayiFeng JiayiFeng added the type-bug An unexpected behavior, bug, or error label Aug 9, 2022
@mdboom mdboom added the 3.9 only security fixes label Aug 9, 2022
@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Aug 20, 2022

Reproducer:

#include "Python.h"
#include "pthread.h"
#include <unistd.h>

static struct PyModuleDef bugmodule = {
    PyModuleDef_HEAD_INIT,
    "bug",
    NULL,
    -1,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL
};


int cb(void *arg) {
    printf("callback called %ld\n", (int64_t)arg);
    return 0;
}

static void *func(void *arg)
{
    printf("in func\n");
    int64_t i = 0;
    while (1) {
        sleep(1);
        Py_AddPendingCall(cb, (void *)i++);
    }
    return NULL;
}

PyMODINIT_FUNC
PyInit_bug(void)
{
    pthread_t thread;
    pthread_create(&thread, NULL, func, NULL);
    return PyModule_Create(&bugmodule);
}

Python script:

import bug

while True:
    pass

On main:

@kumaraditya303 ➜ /workspaces/cpython (main ✗) $ ./python main.py 
in func

The pending calls are not executed and is broken.

@kumaraditya303 kumaraditya303 added 3.11 only security fixes 3.10 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed 3.9 only security fixes labels Aug 20, 2022
@JiayiFeng
Copy link
Author

JiayiFeng commented Aug 23, 2022

Here is another demo to reproduce the issue:

#include "Python.h"
#include <chrono>
#include <iostream>
#include <thread>

int main() {
  constexpr char code[] = R"PYTHON(
while True:
    pass
)PYTHON";
  std::thread th([&]() {
    Py_Initialize();
    PyRun_SimpleString(code);
    Py_Finalize();
  });

  // sleep a little time to make sure the Python interpreter is running.
  std::this_thread::sleep_for(std::chrono::seconds(1));

  // try to interrupt Python interpreter
  Py_AddPendingCall(
      [](void *) {
        PyErr_SetString(PyExc_KeyboardInterrupt, "abort");
        return -1;
      },
      nullptr);

  th.join();
  std::cout << "ok" << std::endl;
  return 0;
}

With Python 3.8.13, the demo ends normally, while with Python 3.9.12 it hangs forever.

@kumaraditya303
Copy link
Contributor

The fix is simple, just allows all threads to set eval breaker if pending calls to do.

@vstinner
Copy link
Member

vstinner commented Sep 7, 2022

I would expect your code to work, but you say that it doesn't. I'm not sure why.

This issue is related to issue #84191 and my commit d831688.

I added this function:

/* Only execute pending calls on the main thread. */
static inline int
_Py_ThreadCanHandlePendingCalls(void)
{
    return _Py_IsMainThread();
}

Current ceval code:

/* This can set eval_breaker to 0 even though gil_drop_request became
   1.  We believe this is all right because the eval loop will release
   the GIL eventually anyway. */
static inline void
COMPUTE_EVAL_BREAKER(PyInterpreterState *interp,
                     struct _ceval_runtime_state *ceval,
                     struct _ceval_state *ceval2)
{
    _Py_atomic_store_relaxed(&ceval2->eval_breaker,
        _Py_atomic_load_relaxed_int32(&ceval2->gil_drop_request)
        | (_Py_atomic_load_relaxed_int32(&ceval->signals_pending)
           && _Py_ThreadCanHandleSignals(interp))
        | (_Py_atomic_load_relaxed_int32(&ceval2->pending.calls_to_do)
           && _Py_ThreadCanHandlePendingCalls())
        | ceval2->pending.async_exc);
}

If there is a pending call, a break is only requested if it's the main thread. And the take_gil() functions does that:

    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request)) {
        RESET_GIL_DROP_REQUEST(interp);
    }
    else {
        /* bpo-40010: eval_breaker should be recomputed to be set to 1 if there
           is a pending signal: signal received by another thread which cannot
           handle signals.

           Note: RESET_GIL_DROP_REQUEST() calls COMPUTE_EVAL_BREAKER(). */
        COMPUTE_EVAL_BREAKER(interp, ceval, ceval2);
    }

If ceval, we should go to this place time to time to handle "pending things" like pending calls:

handle_eval_breaker:

    /* Do periodic things, like check for signals and async I/0.
     * We need to do reasonably frequently, but not too frequently.
     * All loops should include a check of the eval breaker.
     * We also check on return from any builtin function.
     */
    if (_Py_HandlePending(tstate) != 0) {
        goto error;
    }

In your example, it seems like there is no other Python thread waiting in take_gil() which calls SET_GIL_DROP_REQUEST() to ask the main Python thread to interrupt itself, pass the GIL, and then the pending call is handled when the main thread gets again the GIL.

@merlin66
Copy link

merlin66 commented Mar 24, 2023

I have been hit by this bug too, trying to upgrade from Python 3.8.6 to Python 3.11.2. We are just running Python in the main thread, and we call Py_AddPendingCall from a separate pure C/C++ thread. The pending calls are not being called from the main thread unless we do something special there, like calling time.sleep().

Based on the comments here we tried a workaround. Simply spawn a dummy Python thread, for example like this:

import threading
threading.Thread(target=lambda: threading.Event().wait()).start()

Once this has been started, the pending calls are being called again even when the main thread is just running while True: pass.
Mentioning this in case someone finds the workaround useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants