Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core][1/N] Make executor to be a long-running Python thread #51016

Merged
merged 2 commits into from
Mar 2, 2025

Conversation

kevin85421
Copy link
Member

@kevin85421 kevin85421 commented Mar 1, 2025

Why are these changes needed?

This is the same as #50644

Currently, tasks are sometimes executed on the main thread and sometimes on other threads which is created by C++. However, C++ threads are only considered Python threads when they execute Python functions. Hence, when a task finishes, the thread-local state will be garbage-collected. See #46336 (comment) for more details.

This PR uses the CPython API to treat C++ threads as Python threads, even if they do not execute Python functions.
https://docs.python.org/3/c-api/init.html#non-python-created-threads

This PR handles synchronous tasks. After this PR is merged, the follow-up tasks are:

  • Always create a default executor.
  • Move the constructor to the default executor.
  • Support asynchronous tasks.

Related issue number

Part of #46336

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: kaihsun <kaihsun@anyscale.com>
Signed-off-by: kaihsun <kaihsun@anyscale.com>
@kevin85421 kevin85421 added the go add ONLY when ready to merge, run all tests label Mar 1, 2025
@kevin85421
Copy link
Member Author

Are we certain that the gstate object does not need to be reference counted? Just want to make sure there is no concern of crashing during the shutdown sequence due to a memory corruption issue.

#50644 (review)

  1. https://github.com/python/cpython/blob/75f38af7810af1c3ca567d6224a975f85aef970f/Include/pystate.h#L76-L78 PyGILState_STATE is merely an enum. It is not a PyObject, so it is not managed by reference counting.

  2. I ran this example 100 times and all of them pass without a memory corruption issue.

@kevin85421
Copy link
Member Author

I messed up the commits in the old PR, so I created a new one. Additionally, this comment contains details of the investigation I conducted on PyGILState_STATE.

@kevin85421 kevin85421 marked this pull request as ready for review March 1, 2025 09:32
Copy link
Contributor

@edoakes edoakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@edoakes edoakes merged commit 543e6e4 into ray-project:master Mar 2, 2025
6 checks passed
VamshikShetty pushed a commit to VamshikShetty/ray that referenced this pull request Mar 3, 2025
…ject#51016)

This is the same as ray-project#50644

Currently, tasks are sometimes executed on the main thread and sometimes
on other threads which is created by C++. However, C++ threads are only
considered Python threads when they execute Python functions. Hence,
when a task finishes, the thread-local state will be garbage-collected.
See
ray-project#46336 (comment)
for more details.

This PR uses the CPython API to treat C++ threads as Python threads,
even if they do not execute Python functions.
https://docs.python.org/3/c-api/init.html#non-python-created-threads

This PR handles synchronous tasks. After this PR is merged, the
follow-up tasks are:
* Always create a default executor.
* Move the constructor to the default executor.
* Support asynchronous tasks.

## Related issue number

Part of ray-project#46336

---------

Signed-off-by: kaihsun <kaihsun@anyscale.com>
Signed-off-by: vs030455 <vamshikdshetty@gmail.com>
Michaelhess17 pushed a commit to Michaelhess17/ray that referenced this pull request Mar 3, 2025
…ject#51016)

This is the same as ray-project#50644

Currently, tasks are sometimes executed on the main thread and sometimes
on other threads which is created by C++. However, C++ threads are only
considered Python threads when they execute Python functions. Hence,
when a task finishes, the thread-local state will be garbage-collected.
See
ray-project#46336 (comment)
for more details.

This PR uses the CPython API to treat C++ threads as Python threads,
even if they do not execute Python functions.
https://docs.python.org/3/c-api/init.html#non-python-created-threads

This PR handles synchronous tasks. After this PR is merged, the
follow-up tasks are:
* Always create a default executor.
* Move the constructor to the default executor.
* Support asynchronous tasks.

## Related issue number

Part of ray-project#46336 

---------

Signed-off-by: kaihsun <kaihsun@anyscale.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025
…ject#51016)

This is the same as ray-project#50644

Currently, tasks are sometimes executed on the main thread and sometimes
on other threads which is created by C++. However, C++ threads are only
considered Python threads when they execute Python functions. Hence,
when a task finishes, the thread-local state will be garbage-collected.
See
ray-project#46336 (comment)
for more details.

This PR uses the CPython API to treat C++ threads as Python threads,
even if they do not execute Python functions.
https://docs.python.org/3/c-api/init.html#non-python-created-threads

This PR handles synchronous tasks. After this PR is merged, the
follow-up tasks are:
* Always create a default executor.
* Move the constructor to the default executor.
* Support asynchronous tasks.

## Related issue number

Part of ray-project#46336 

---------

Signed-off-by: kaihsun <kaihsun@anyscale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants