Ray fails to serialize self-reference objects #1234

Closed
suquark opened this issue Nov 20, 2017 · 4 comments
Comments

@suquark
Member

suquark commented Nov 20, 2017

System information

  • Ray installed from (source or binary): pip
  • Ray version: 0.2.2
  • Python version: 3.6.2

Describe the problem

Ray fails to serialize self-referential objects (for example, Graph objects in networkx).

I think this is because Ray always tries pyarrow first and does not catch pyarrow.lib.ArrowNotImplementedError; see

ray/python/ray/worker.py

Lines 285 to 289 in e0360eb

try:
    self.plasma_client.put(value, pyarrow.plasma.ObjectID(
        object_id.id()), self.serialization_context)
    break
except pyarrow.SerializationCallbackError as e:

After catching pyarrow.lib.ArrowNotImplementedError, we should not use use_dict=True as a workaround, because that would cause an endless loop. A correct approach might be:

            except (pyarrow.SerializationCallbackError, pyarrow.lib.ArrowNotImplementedError) as e:
                try:
                    if isinstance(e, pyarrow.lib.ArrowNotImplementedError):
                        e.example_object = value
                        raise e  # redirect to use cloudpickle
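To make the control flow concrete, here is a minimal, Ray-free sketch of the "try the fast path once, then switch serializers" idea. Everything in it is an assumption for illustration: `naive_serialize`, `DepthExceeded`, `MAX_DEPTH`, and `put` are hypothetical stand-ins (not Ray or pyarrow APIs), and the standard-library `pickle` is used in place of cloudpickle.

```python
import pickle

MAX_DEPTH = 100  # hypothetical cap, standing in for pyarrow's recursion limit


class DepthExceeded(Exception):
    """Stand-in for pyarrow.lib.ArrowNotImplementedError in this sketch."""


def naive_serialize(obj, depth=0):
    """Copy nested dicts and lists recursively, with no cycle detection."""
    if depth > MAX_DEPTH:
        raise DepthExceeded("object exceeds the maximum recursion depth")
    if isinstance(obj, dict):
        return {k: naive_serialize(v, depth + 1) for k, v in obj.items()}
    if isinstance(obj, list):
        return [naive_serialize(v, depth + 1) for v in obj]
    return obj


def put(obj):
    """Try the fast path once; on a depth failure, fall back to pickle."""
    try:
        return "fast", naive_serialize(obj)
    except DepthExceeded:
        # Retrying the fast path would fail the same way every time,
        # so switch serializers instead of looping.
        return "pickle", pickle.dumps(obj)


graph = {"name": "G"}
graph["g"] = graph  # self-reference, like Graph.g = self in the report

flat_kind, _ = put({"name": "G", "edges": [[1, 2], [1, 3]]})
cyclic_kind, payload = put(graph)
print(flat_kind, cyclic_kind)
```

The acyclic input stays on the fast path, while the self-referential one is routed to the fallback exactly once rather than being retried.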

Source code / logs

import ray
ray.init()

class Graph:
    def __init__(self):
        self.g = self

G = Graph()
ray.put(G)  # --> pyarrow.lib.ArrowNotImplementedError: This object exceeds the maximum recursion depth. It may contain itself recursively.

# another example

import networkx as nx
G = nx.Graph()
    
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(1)
G.add_edge(1, 2)
G.add_node("spam")  # adds node "spam"
G.add_nodes_from("spam")  # adds 4 nodes: 's', 'p', 'a', 'm'
G.add_edge(3, 'm')
ray.put(G)  # --> pyarrow.lib.ArrowNotImplementedError: This object exceeds the maximum recursion depth. It may contain itself recursively.

@mitar

@robertnishihara
Collaborator

Can you try ray.register_custom_serializer? The following works for me.

import ray
ray.init()

class Graph:
    def __init__(self):
        self.g = self

ray.register_custom_serializer(Graph, use_pickle=True)

G = Graph()
ray.put(G)

This is closely related to #319 and https://issues.apache.org/jira/browse/ARROW-1382.

A side comment: the original code worked for me in Python 2, I think because Graph is an old-style class there, so we automatically fall back to pickle anyway.
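As a quick illustration of why the pickle fallback copes with self-references, here is a small standalone sketch using only the standard library. It uses built-in containers rather than the Graph class from the report, which is an assumption made to keep it self-contained; pickle records each object it has already written, so a cycle becomes a back-reference instead of unbounded recursion.

```python
import pickle

shared = [1, 2, 3]
cycle = {"a": shared, "b": shared}
cycle["self"] = cycle  # the dict now contains itself

restored = pickle.loads(pickle.dumps(cycle))

# The cycle survives the round trip.
print(restored["self"] is restored)
# The shared child is still a single object, not two copies.
print(restored["a"] is restored["b"])
```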

@mitar
Member

mitar commented Nov 20, 2017

Hm, so ideally we would like to serialize networkx graphs. Because they can be quite large, I am not sure if pickling is a good approach.

@robertnishihara
Collaborator

Custom serializers and deserializers can be registered the same way. I'm not sure what the right one would be in this case, but as a simple example, you could do something like:

import numpy as np
import ray

ray.init()

class Graph:
    def __init__(self, big_array):
        self.g = self
        self.big_array = big_array

def custom_graph_serializer(obj):
    return obj.big_array

def custom_graph_deserializer(serialized_obj):
    return Graph(serialized_obj)

ray.register_custom_serializer(Graph,
                               serializer=custom_graph_serializer,
                               deserializer=custom_graph_deserializer)

G = Graph(np.ones(100))
ray.put(G)
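For a graph-shaped object, one possible serializer/deserializer pair flattens the structure into plain node and edge lists, which contain no cycles. This is only a sketch under stated assumptions: `SimpleGraph`, `graph_serializer`, and `graph_deserializer` are hypothetical names, `SimpleGraph` is a toy adjacency-set class rather than the networkx one, and the code is pure Python so it runs without Ray.

```python
class SimpleGraph:
    """Hypothetical adjacency-set graph; not the networkx class."""

    def __init__(self):
        self.adj = {}
        self.owner = self  # self-reference, as in the report

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)


def graph_serializer(graph):
    """Flatten to acyclic data: a node list and a de-duplicated edge list."""
    order = {node: i for i, node in enumerate(graph.adj)}
    edges = [(u, v) for u in graph.adj for v in graph.adj[u]
             if order[u] <= order[v]]  # keep each undirected edge once
    return {"nodes": list(order), "edges": edges}


def graph_deserializer(data):
    """Rebuild the graph; the constructor restores the self-reference."""
    graph = SimpleGraph()
    for node in data["nodes"]:
        graph.adj.setdefault(node, set())
    for u, v in data["edges"]:
        graph.add_edge(u, v)
    return graph


g = SimpleGraph()
for u, v in [(1, 2), (1, 3), (3, "m")]:
    g.add_edge(u, v)

data = graph_serializer(g)
restored = graph_deserializer(data)
print(restored.adj == g.adj)
```

A pair like this could presumably be passed as the serializer and deserializer arguments to ray.register_custom_serializer, in the same way as the big_array example above; whether it beats pickling for large graphs would need measuring.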

@edoakes
Contributor

edoakes commented Mar 5, 2020

Stale - please open new issue if still relevant

@edoakes edoakes closed this as completed Mar 5, 2020
fishbone added a commit that referenced this issue Nov 3, 2021
## Why are these changes needed?
This is part of redis removal project. This PR is going to enable grpc based broadcasting by default.

## Related issue number

<!-- For example: "Closes #1234" -->
#19438 
## Checks
rkooo567 added a commit that referenced this issue Nov 16, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

There's one user who has an issue that one of raylets cannot schedule tasks anymore because `num_worker_not_started_by_job_config_not_exist ` > 0.

This PR adds better log messages to figure out if the root cause is the job information is not properly propagated from GCS to raylet through Redis pubsub. 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
wuisawesome pushed a commit that referenced this issue Nov 16, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

This pin is needed to fix `test_output` on master, which broke when 4.0.0 was released. 

It may also fix the windows build (unsure). 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
rkooo567 pushed a commit that referenced this issue Nov 17, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

The change in #20374 was interpreted as a file redirect, not a "greater than" by docker (strangely enough, differently than bash interprets it locally). 

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex <alex@anyscale.com>
wuisawesome pushed a commit that referenced this issue Nov 17, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

This PR adds the hiredis dependency for non M1 machines. 

This removes the `redis < 4.0` pin.

Since hiredis doesn't have M1 mac wheels yet, so users there will have extra warning messages in their outputs if they use redis 4.0.
<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex Wu <alex@anyscale.com>
fishbone pushed a commit that referenced this issue Nov 18, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

The change in #20374 was interpreted as a file redirect, not a "greater than" by docker (strangely enough, differently than bash interprets it locally). 

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex <alex@anyscale.com>
wuisawesome pushed a commit that referenced this issue Nov 20, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

There's one user who has an issue that one of raylets cannot schedule tasks anymore because `num_worker_not_started_by_job_config_not_exist ` > 0.

This PR adds better log messages to figure out if the root cause is the job information is not properly propagated from GCS to raylet through Redis pubsub. 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
wuisawesome pushed a commit that referenced this issue Nov 20, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

This pin is needed to fix `test_output` on master, which broke when 4.0.0 was released. 

It may also fix the windows build (unsure). 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
wuisawesome pushed a commit that referenced this issue Nov 20, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

The change in #20374 was interpreted as a file redirect, not a "greater than" by docker (strangely enough, differently than bash interprets it locally). 

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex <alex@anyscale.com>
wuisawesome pushed a commit that referenced this issue Nov 21, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

There's one user who has an issue that one of raylets cannot schedule tasks anymore because `num_worker_not_started_by_job_config_not_exist ` > 0.

This PR adds better log messages to figure out if the root cause is the job information is not properly propagated from GCS to raylet through Redis pubsub. 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
wuisawesome pushed a commit that referenced this issue Nov 21, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

This pin is needed to fix `test_output` on master, which broke when 4.0.0 was released. 

It may also fix the windows build (unsure). 

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
wuisawesome pushed a commit that referenced this issue Nov 21, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

The change in #20374 was interpreted as a file redirect, not a "greater than" by docker (strangely enough, differently than bash interprets it locally). 

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex <alex@anyscale.com>
rkooo567 added a commit that referenced this issue Nov 23, 2021
…" (#20668)

This reverts commit e9132ed.

<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

Seems to break Windows build. 

```
(07:46:25) ERROR: BUILD.bazel:406:11: Compiling src/ray/common/task/task_spec.cc failed: (Exit 2): cl.exe failed: error executing command
```

<img width="487" alt="Screen Shot 2021-11-23 at 3 09 18 AM" src="https://user-images.githubusercontent.com/18510752/143013973-f157724c-4951-49a9-80c6-158d41aa4295.png">


## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
rkooo567 added a commit that referenced this issue Jun 1, 2022
This reverts commit 02f220b.

<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

Looks like this commit makes `test_ray_shutdown` way more flaky.  cc @mattip for further investigation after revert
<img width="760" alt="Screen Shot 2022-05-31 at 11 14 48 PM" src="https://user-images.githubusercontent.com/18510752/171339737-f48e6e90-391a-4235-bfac-a0aa0e563eb7.png">


## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
rkooo567 added a commit that referenced this issue Jan 16, 2023
#31454)

…28)"

This reverts commit a0c894f.

<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
andreapiso pushed a commit to andreapiso/ray that referenced this issue Jan 22, 2023
)" (ray-project#313… (ray-project#31454)

…28)"

This reverts commit a0c894f.

<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Andrea Pisoni <andreapiso@gmail.com>
jcoffi added a commit to jcoffi/ray that referenced this issue Jan 25, 2023
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
jcoffi added a commit to jcoffi/ray that referenced this issue Jan 26, 2023
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
rkooo567 pushed a commit that referenced this issue Jan 26, 2023
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?
These flags are no longer useful because the migration has been finished. Delete them.
<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
jcoffi added a commit to jcoffi/ray that referenced this issue Feb 14, 2023
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
jcoffi added a commit to jcoffi/ray that referenced this issue Feb 15, 2023
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…able (ray-project#50176)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

This change makes `AUTOSCALER_MAX_RESOURCE_DEMAND_VECTOR_SIZE`
configurable.

Power users may wish to submit more than 1000 tasks at once and have the
autoscaler respond by immediately scaling up the requisite number of
nodes.

To make this happen, `AUTOSCALER_MAX_RESOURCE_DEMAND_VECTOR_SIZE` must
be increased beyond the 1000 cap; otherwise, the demand from most tasks
is ignored and upscaling is slow.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

Limited `AUTOSCALER_MAX_RESOURCE_DEMAND_VECTOR_SIZE` causes the issue
experienced in ray-project#45373.

This PR provides a workaround.
After merging this PR, if a user wants, say, 10k tasks to trigger quick
upscaling, then the user can increase
`AUTOSCALER_MAX_RESOURCE_DEMAND_VECTOR_SIZE` past 10k.

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
  
I tested it experimentally by increasing
`AUTOSCALER_MAX_RESOURCE_DEMAND_VECTOR_SIZE` to 100k and submitting 10k
tasks; upscaling happened smoothly.

---------

Signed-off-by: Dmitri Gekhtman <dmitri.gekhtman@getcruise.com>
Co-authored-by: Dmitri Gekhtman <dmitri.gekhtman@getcruise.com>
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
ray-project#49678)

there are two **same** param: --**runtime-env-agent-port**, remove one.

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: hefeiyun <hfy231-30@163.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

Get user request to support class constructor args for
`dataset.filter()`, similar to flat_map, map, and map_batches.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: liuxsh9 <liuxiaoshuang4@huawei.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…oject#50363)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Starting with [grpc
v1.32.0](https://github.com/grpc/grpc/tree/v1.32.0/src/python/grpcio/grpc/experimental/aio),
the aio module was moved out of the experimental directory into the main
directories. The latest grpc release is v1.70.1, and v1.32.0 was
released 4.5 years ago.

I searched the Ray codebase for references to grpcio. Most of them
specify grpcio v1.66.1. The oldest reference to a grpcio version is
v1.32.0, which has already been moved to the main directories.


https://github.com/ray-project/ray/blob/9e3ec5972cd952d2b50f3b20abc24ced5abb8b54/python/setup.py#L259

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: kaihsun <kaihsun@anyscale.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?
This change is needed to prevent the Redis password from being logged in
the standard logging output. This is a security vulnerability.
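
As a rough sketch of the general idea (not the change made in this PR), a secret can be masked before a command line reaches the logger. The flag spelling `--redis-password` / `--redis_password` and the helper name `redact_password` are assumptions for illustration.

```python
import re

# Assumed flag spelling; matches both "--redis-password=x" and "--redis_password x".
_SECRET_FLAG = re.compile(r"(--redis[-_]password[= ])(\S+)")


def redact_password(command: str) -> str:
    """Replace the value following the password flag with a placeholder."""
    return _SECRET_FLAG.sub(r"\1<redacted>", command)


print(redact_password("gcs_server --redis-password=hunter2 --port=6379"))
```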

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->
Closes ray-project#50266

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [x] This PR is not tested :(

---------

Signed-off-by: Letao Jiang <letaoj@gmail.com>
Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…ack link (ray-project#50491)

<img width="1725" alt="Screenshot 2025-02-12 at 10 23 40 AM"
src="https://github.com/user-attachments/assets/79396c2f-4eb8-4e70-8310-f64b9851679d"
/>

<img width="1720" alt="Screenshot 2025-02-12 at 10 50 48 AM"
src="https://github.com/user-attachments/assets/7d8cd221-393a-42a0-aee3-5edb3b23104e"
/>




## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Chris Zhang <chris@anyscale.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

1. Added the `JSON` logging format option in Ray.
2. Updated the API doc as well as the Ray logging doc according to the
change:
    - Rearranged the sections in the existing Ray logging document
    - Copied the content of the structured logging document from the
      Anyscale document into the Ray logging document
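
For readers new to structured logging, a JSON format generally means one JSON object per log record. The sketch below uses only the standard library and is not Ray's implementation; the class name `JsonFormatter` and the chosen field names are assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Illustrative formatter that renders each record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "asctime": self.formatTime(record),
            "levelname": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("json_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("task %s finished", "f")
```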

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Co-authored-by: Dhyey Shah <dhyey2019@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

This PR contains some small edits to the TPU Ray documentation to
clarify the sections, specifically making it more clear that the [Ray
TPU initialization
webhook](https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main/ray-on-gke/tpu/kuberay-tpu-webhook)
is included in the GKE Ray addon and manual installation is optional.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

This is the first step towards
ray-project#47933

It has not been tested much yet (on Python 3.13), but it compiles
locally (with `pip install -e . --verbose`) and can execute a simple
workload like
```
>>> import ray
>>> ray.init()
2024-10-10 16:03:31,857	INFO worker.py:1799 -- Started a local Ray instance.
RayContext(dashboard_url='', python_version='3.13.0', ray_version='3.0.0.dev0', ray_commit='{{RAY_COMMIT_SHA}}')
>>> @ray.remote
... def f():
...     return 42
...     
>>> ray.get(f.remote())
42
>>> 
```
(and similar for actors).

The main thing that needed to change to make Ray work on Python 3.13 was
to upgrade Cython to 3.0.11, which seems to be the first version of
Cython to support Python 3.13. Unfortunately, it has a compiler bug,
cython/cython#3235 (the fix is not released yet),
that I had to work around.

I also had to work around cython/cython#5750
by changing some typing from `float` to `int | float`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: pcmoritz <pcmoritz@anyscale.com>
Co-authored-by: srinathk10 <68668616+srinathk10@users.noreply.github.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…ray-project#50462)

## Why are these changes needed?

Next step after ray-project#50160 to make it
more convenient to use UV with Ray.

This is a useful runtime environment hook for mirroring the environment
of `uv run` to the workers (currently the args to uv run and the
working_dir). This is useful because it will allow people to intuitively
use `uv run` in a distributed application with the same behavior as for
a single python process.

This only modifies the environment if the driver was run with `uv run`
and could conceivably become the default for drivers run with uv run.

This is currently a developer API, as implied by the fact that it is in
the `_private` namespace. It is currently for experimentation and
needs to be opted into via

```shell
export RAY_RUNTIME_ENV_HOOK=ray._private.runtime_env.uv_runtime_env_hook.hook
```

If it works well, we might make it the default in the `uv run` case.
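
To make the behavior described above concrete, here is a minimal sketch of what such a hook could do. It is not the code in `uv_runtime_env_hook`: the function name `sketch_uv_hook`, the `py_executable` and `working_dir` keys as used here, and the rule that the uv prefix ends at the first `*.py` argument are all simplifying assumptions.

```python
import os
from typing import Any, Dict, List, Optional


def sketch_uv_hook(
    runtime_env: Optional[Dict[str, Any]],
    cmdline: List[str],
    cwd: str,
) -> Dict[str, Any]:
    """Hypothetical hook: mirror the driver's `uv run` invocation to workers."""
    env = dict(runtime_env or {})
    started_with_uv = (
        len(cmdline) >= 2
        and os.path.basename(cmdline[0]) == "uv"
        and cmdline[1] == "run"
    )
    if not started_with_uv:
        return env  # Driver was not started with `uv run`: leave it unchanged.

    # Assumption: everything before the first *.py argument belongs to uv.
    prefix = []
    for arg in cmdline:
        if arg.endswith(".py"):
            break
        prefix.append(arg)

    # Do not override values the user set explicitly.
    env.setdefault("py_executable", " ".join(prefix))
    env.setdefault("working_dir", cwd)
    return env


print(sketch_uv_hook(None, ["uv", "run", "--with", "emoji", "main.py"], "/tmp/app"))
```

The early return reflects the point that the environment is only modified when the driver was run with `uv run`.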

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…ray-project#50597)

## Why are these changes needed?

This fixes an issue @erictang000 was running into while using the uv
runtime env hook for https://github.com/hiyouga/LLaMA-Factory. LLaMA
Factory modifies sys.argv before calling ray.init (and therefore the
hook), which broke the original logic. We instead use the value from
`/proc/{pid}/cmdline` and add a test verifying that the hook tolerates
changes to sys.argv.
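
A minimal sketch of why `/proc/{pid}/cmdline` is robust here: on Linux it holds the original arguments as NUL-separated bytes, and reassigning `sys.argv` in Python does not change it. The helper names `parse_cmdline` and `read_cmdline` are hypothetical, and this is not the hook's actual code.

```python
import os
import sys
from typing import List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    if not raw:
        return []
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # Drop only the terminating NUL, keep empty arguments.
    return [part.decode(errors="replace") for part in raw.split(b"\0")]


def read_cmdline(pid: int) -> List[str]:
    """Read the command line a process was started with (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return parse_cmdline(f.read())


if os.path.exists("/proc/self/cmdline"):
    original = read_cmdline(os.getpid())
    saved_argv = list(sys.argv)
    sys.argv = ["rewritten-by-a-library"]  # What a library might do before ray.init.
    print(read_cmdline(os.getpid()) == original)
    sys.argv = saved_argv
```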

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

After ray-project#47984 we can now build the
wheels.

Note that these are not tested yet, since unfortunately conda doesn't
currently support Python 3.13 as the base environment
(conda/conda#14353) and changing everything to
run in a custom conda environment seems to require a decent amount of
changes.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…els (ray-project#50667)

## Why are these changes needed?

Add nightly Python 3.13 wheels (marked as alpha) to the documentation
and also add aarch64 Linux wheels, which I forgot in
ray-project#50531

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…oject#50674)

## Why are these changes needed?

Adds user guide and link-ins for Ray Data documentation.

This is part of the ray-project#50639 thread of work.

This is based on ray-project#50494 

cc @comaniac @gvspraveen @kouroshHakha  

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…ray-project#50714)

## Why are these changes needed?

This is part of ray-project#48133.
Continuing the approach taken in
ray-project#49426, make all the discretizers
work in append mode

## Related issue number

[<!-- For example: "Closes ray-project#1234"
-->](ray-project#48133)

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [x] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Martin Bomio <martinbomio@spotify.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
… node (ray-project#50712)

## Why are these changes needed?

* `iter_torch_batches` release test only needs GPU for the head node.
Make it use CPU worker nodes.
* Batch inference release tests don't need GPU for the head node. Update
its compute config.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Hao Chen <chenh1024@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…#50747)

## Why are these changes needed?

LLM API documentation updates from dogfooding @comaniac

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…50772)

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->
Adding Anyscale as an option to get started with Ray. This is the first
round of changes to add the option to 3 pages.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Sijie Wang <3463757+sijieamoy@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

Docstring improvements

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…#50788)

## Why are these changes needed?

Adding requirements sections to the LLM overview page to let users know
which dependencies to install, and suggesting that they pin
`xgrammar==0.1.11` and `pynvml==12.0.0` when paired with vllm 0.7.2

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Gene Su <e870252314@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
- add **a**~n~ reviewer ...

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

see:
- To assign **a** reviewer in

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [x] render the file CONTRIBUTING.rst

Signed-off-by: Serge Croisé <SergeCroise@users.noreply.github.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?
As part of the initiative to introduce cpplint into the pre-commit hook,
we are gradually cleaning up C++ folders to ensure compliance with code
style requirements. This issue focuses on cleaning up `src/ray/pubsub`.
<!-- Please give a short summary of the change and the problem this
solves. -->
- This is the command that I ran
```
cpplint \
  --filter=-whitespace/line_length,\
-build/c++11,\
-build/c++14,\
-build/c++17,\
-readability/braces,\
-whitespace/indent_namespace,\
-runtime/int,\
-runtime/references,\
-build/include_order \
  src/ray/pubsub/*.h \
  src/ray/pubsub/*.cc \
  src/ray/pubsub/**/*.h \
  src/ray/pubsub/**/*.cc 
```
- The log output
```
Skipping input 'src/ray/pubsub/**/*.h': Can't open for reading
src/ray/pubsub/publisher.cc:416:  Add #include <memory> for make_unique<>  [build/include_what_you_use] [4]
src/ray/pubsub/publisher.cc:442:  Add #include <utility> for move  [build/include_what_you_use] [4]
src/ray/pubsub/publisher.cc:504:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
src/ray/pubsub/publisher.cc:543:  Add #include <string> for string  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/publisher.cc
src/ray/pubsub/publisher.h:94:  Single-parameter constructors should be marked explicit.  [runtime/explicit] [4]
src/ray/pubsub/publisher.h:309:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
src/ray/pubsub/publisher.h:316:  Add #include <utility> for move  [build/include_what_you_use] [4]
src/ray/pubsub/publisher.h:454:  Add #include <memory> for unique_ptr<>  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/publisher.h
src/ray/pubsub/subscriber.cc:319:  Add #include <memory> for make_unique<>  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.cc:461:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.cc:482:  Add #include <utility> for move  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.cc:522:  Add #include <string> for string  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/subscriber.cc
src/ray/pubsub/subscriber.h:332:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.h:413:  Add #include <string> for string  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.h:492:  Add #include <memory> for unique_ptr<>  [build/include_what_you_use] [4]
src/ray/pubsub/subscriber.h:497:  Add #include <utility> for pair<>  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/subscriber.h
src/ray/pubsub/test/integration_test.cc:173:  Add #include <utility> for move  [build/include_what_you_use] [4]
src/ray/pubsub/test/integration_test.cc:234:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/test/integration_test.cc
src/ray/pubsub/test/publisher_test.cc:30:  Do not use namespace using-directives.  Use using-declarations instead.  [build/namespaces] [5]
src/ray/pubsub/test/publisher_test.cc:613:  Add #include <memory> for make_shared<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/publisher_test.cc:827:  Add #include <algorithm> for max  [build/include_what_you_use] [4]
src/ray/pubsub/test/publisher_test.cc:1063:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/publisher_test.cc:1248:  Add #include <string> for string  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/test/publisher_test.cc
src/ray/pubsub/test/subscriber_test.cc:23:  Extra space before [  [whitespace/braces] [5]
src/ray/pubsub/test/subscriber_test.cc:239:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:245:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:282:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:292:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:298:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:299:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:328:  Consider using ASSERT_EQ instead of ASSERT_TRUE(a == b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:379:  Consider using ASSERT_GT instead of ASSERT_TRUE(a > b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:410:  Consider using ASSERT_GT instead of ASSERT_TRUE(a > b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:418:  Consider using ASSERT_GT instead of ASSERT_TRUE(a > b)  [readability/check] [2]
src/ray/pubsub/test/subscriber_test.cc:108:  Add #include <utility> for move  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:118:  Add #include <deque> for deque<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:119:  Add #include <queue> for queue<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:208:  Add #include <memory> for shared_ptr<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:209:  Add #include <unordered_map> for unordered_map<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:210:  Add #include <unordered_set> for unordered_set<>  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:909:  Add #include <string> for string  [build/include_what_you_use] [4]
src/ray/pubsub/test/subscriber_test.cc:930:  Add #include <vector> for vector<>  [build/include_what_you_use] [4]
Done processing src/ray/pubsub/test/subscriber_test.cc
Total errors found: 42
```
> I've separated the changes for each cpplint error into separate
commits.
## Related issue number

<!-- For example: "Closes ray-project#1234" -->
Closes ray-project#50728 
## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…#49168)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Our pattern of using Ray Serve has us deploying many hundreds/thousands
of apps using the imperative API (`serve.run`). This ends up being very
slow because the Controller needs to checkpoint as part of every RPC. It
would be significantly more efficient to batch the deploys so that we
can checkpoint fewer times.

This PR adds a new `serve.run_many()` public API, marked as
developer-only, that can submit many applications to the Serve
Controller in one RPC, with just a single checkpoint being saved after
all of those applications are registered. The entire existing code path
(including `serve.run()`) is refactored to be bulk operations under the
hood (`serve.run()` calls `serve.run_many()`).
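
The shape of that refactor can be sketched with a toy model. This is only an illustration under assumptions: `ToyServeController`, its `deploy_applications` method, and the free functions `run` / `run_many` below are hypothetical stand-ins, not the real Serve signatures. The point is that the single-app path delegates to the bulk path, and the bulk path writes one checkpoint per RPC.

```python
from typing import Dict


class ToyServeController:
    """Hypothetical stand-in for a controller that checkpoints once per RPC."""

    def __init__(self) -> None:
        self.apps: Dict[str, str] = {}
        self.checkpoints_written = 0

    def deploy_applications(self, apps: Dict[str, str]) -> None:
        # One RPC: register every application first...
        for name, import_path in apps.items():
            self.apps[name] = import_path
        # ...then save a single checkpoint for the whole batch.
        self.checkpoints_written += 1


def run_many(controller: ToyServeController, apps: Dict[str, str]) -> None:
    """Toy bulk entry point: many apps, one RPC."""
    controller.deploy_applications(apps)


def run(controller: ToyServeController, name: str, import_path: str) -> None:
    """Toy single-app entry point, implemented as a batch of one."""
    run_many(controller, {name: import_path})


one_by_one = ToyServeController()
for i in range(3):
    run(one_by_one, f"app-{i}", f"module_{i}:app")

batched = ToyServeController()
run_many(batched, {f"app-{i}": f"module_{i}:app" for i in range(3)})

print(one_by_one.checkpoints_written, batched.checkpoints_written)
```

Both controllers end up with the same three applications registered; only the number of checkpoints differs.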

To further help with our particular use case, where the applications are
being deployed from a controller that doesn't care about waiting for
e.g. ingress deployment creation, the new code path also has
fine-grained control over which things are waited for.

---

Just introducing a batch API isn't sufficient to actually provide a
meaningful speedup. As mentioned above, the slow part is the
checkpointing, and right now the checkpointing is very granular: the
various stateful components checkpoint themselves at the bottom of the
call stack, so even a single RPC might cause them to checkpoint multiple
times.

Below I've tried to map out all the reasons that the
`Application/DeploymentStateManager`s might checkpoint:

```mermaid
graph TD;
    deployment_state_set_target_state[DeploymentState._set_target_state] --> dsm_checkpoint[DeploymentStateManager._save_checkpoint_func]
    deployment_state_deploy[DeploymentState.deploy] --> deployment_state_set_target_state
    deployment_state_manager_deploy[DeploymentStateManager.deploy] --> deployment_state_deploy
    application_state_apply_deployment_info[ApplicationState.apply_deployment_info] --> deployment_state_manager_deploy
    application_state_reconcile_target_deployments[ApplicationState._reconcile_target_deployments] --x application_state_apply_deployment_info
    application_state_update[ApplicationState.update] --> application_state_reconcile_target_deployments
    application_state_manager_update[ApplicationStateManager.update] --x application_state_update
    serve_controller_run_control_loop[ServeController.run_control_loop] --> application_state_manager_update
    


    deployment_state_set_target_state_deleting[DeploymentState._set_target_state_deleting] --> dsm_checkpoint
    deployment_state_delete[DeploymentState.delete] --> deployment_state_set_target_state_deleting
    deployment_state_manager_delete_deployment[DeploymentStateManager.delete_deployment] --> deployment_state_delete
    application_state_delete_deployment[ApplicationState._delete_deployment] --> deployment_state_manager_delete_deployment
    application_state_reconcile_target_deployments --> application_state_delete_deployment

    deployment_state_autoscale[DeploymentState.autoscale] --> deployment_state_set_target_state
    deployment_state_manager_update[DeploymentStateManager.update] --> deployment_state_autoscale
    serve_controller_run_control_loop --> deployment_state_manager_update

    as_set_target_state[ApplicationState._set_target_state] --> asm_checkpoint[ApplicationStateManager._save_checkpoint_func]

    as_recover_target_state_from_checkpoint[ApplicationState.recover_target_state_from_checkpoint] --> as_set_target_state
    asm_recover_from_checkpoint[ApplicationStateManager._recover_from_checkpoint] --> as_recover_target_state_from_checkpoint
    asm_init[ApplicationStateManager.__init__] --> asm_recover_from_checkpoint
    sc_init[ServeController.__init__] --> asm_init

    as_set_target_state_deleting[ApplicationState._set_target_state_deleting] --> as_set_target_state
    as_delete[ApplicationState.delete] --> as_set_target_state_deleting
    asm_delete_app[ApplicationStateManager.delete_app] --> as_delete
    sc_delete_apps[ServeController.delete_apps] --x asm_delete_app
    RPC --> sc_delete_apps

    as_clear_target_state_and_store_config[ApplicationState._clear_target_state_and_store_config] --> as_set_target_state
    as_apply_app_config[ApplicationState.apply_app_config] --> as_clear_target_state_and_store_config
    asm_apply_app_configs[ApplicationStateManager.apply_app_configs] --x as_apply_app_config
    sc_apply_config[ServeController.apply_config] --> asm_apply_app_configs
    RPC --> sc_apply_config


    as_deploy_app[ApplicationState.deploy_app] --> as_set_target_state
    asm_deploy_app[ApplicationStateManager.deploy_app] --> as_deploy_app
    sc_deploy_application[ServeController.deploy_application] --> asm_deploy_app
    RPC --> sc_deploy_application

    as_apply_app_config --> as_set_target_state
```

So, in addition to the batch API that the client sees, I've refactored
where these checkpoints are done so that they happen at the *top* of
those call stacks instead of at the bottom.

- We still checkpoint before (now just before) returning an RPC that
mutates state.
- We still checkpoint after making any changes to internal state and
before issuing any commands to the cluster to e.g. start/stop replicas
(just not *immediately* after making the internal state change).
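
The ordering in the two bullets above can be illustrated with a small dirty-flag sketch. Everything here is hypothetical (`ToyDeploymentStateManager`, `control_loop_step`, and the event strings are invented for illustration and are not the Serve implementation): low-level mutations only mark the state dirty, and the caller at the top of the stack checkpoints once, after all mutations and before any command is issued.

```python
from typing import Dict, List


class ToyDeploymentStateManager:
    """Hypothetical stateful component that defers checkpointing to its caller."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self._dirty = False
        self._pending_commands: List[str] = []

    def set_target_replicas(self, deployment: str, num_replicas: int) -> None:
        # Bottom of the call stack: mutate state and mark it dirty,
        # but do NOT checkpoint here.
        self.events.append(f"mutate:{deployment}")
        self._pending_commands.append(f"scale:{deployment}:{num_replicas}")
        self._dirty = True

    def save_checkpoint_if_dirty(self) -> None:
        # Called from the top of the call stack; a no-op if nothing changed.
        if self._dirty:
            self.events.append("checkpoint")
            self._dirty = False

    def issue_commands(self) -> None:
        # Commands to the cluster go out only after the checkpoint.
        for command in self._pending_commands:
            self.events.append(f"command:{command}")
        self._pending_commands.clear()


def control_loop_step(
    manager: ToyDeploymentStateManager, targets: Dict[str, int]
) -> None:
    for deployment, num_replicas in targets.items():
        manager.set_target_replicas(deployment, num_replicas)
    manager.save_checkpoint_if_dirty()  # once, after all internal changes
    manager.issue_commands()            # only then touch the cluster


manager = ToyDeploymentStateManager()
control_loop_step(manager, {"a": 1, "b": 2})
print(manager.events)
```

A second call with no state changes would skip the checkpoint entirely, because the dirty flag is still clear.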

I did *not* change the `EndpointState`'s checkpointing because it hasn't
shown up in our flamegraphs.

---

Before these changes, deploying 5k Serve apps, each with one deployment,
took >1 hour and would often never finish because the Serve Controller
would become unresponsive and KubeRay would end up restarting the
cluster.

With these changes, deploying 5k Serve apps with a batch size of 100 per
API call only takes about 90 seconds!
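
For a sense of scale, 5k apps at a batch size of 100 means 50 batched calls. A minimal sketch of the client-side chunking, assuming a hypothetical `batched` helper and placeholder app names (neither is part of this PR):

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Placeholder names standing in for 5k Serve applications.
app_names = [f"app-{i}" for i in range(5000)]
batches = list(batched(app_names, 100))

# Each batch would be submitted in one API call.
print(len(batches), len(batches[0]), len(batches[-1]))
```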

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [x] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Josh Karpel <josh.karpel@gmail.com>
Co-authored-by: Cindy Zhang <cindyzyx9@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
ray-project#49409)

## Why are these changes needed?
I recently used the `ray job submit` CLI and got this confusing error
message:
```sh
TypeError: 'NoneType' object is not callable
```

<!-- Please give a short summary of the change and the problem this
solves. -->

As mentioned in ray-project#41604, this case needs a proper error
message instead of a raw `NoneType` error.
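
The general pattern is to check for the missing value up front and raise an error that says what is missing. The sketch below is not the code changed in this PR: the description does not say which object was `None`, so `call_or_explain` and the `what` label are purely hypothetical.

```python
from typing import Callable, Optional


def call_or_explain(fn: Optional[Callable[[], str]], what: str) -> str:
    """Call `fn`, or fail with a descriptive error if it is missing."""
    if fn is None:
        # Without this check, `fn()` would raise
        # "TypeError: 'NoneType' object is not callable".
        raise RuntimeError(
            f"{what} is not available; check that the required "
            "dependency or configuration is present."
        )
    return fn()


try:
    call_or_explain(None, "The job submission client")
except RuntimeError as err:
    message = str(err)
    print(message)
```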

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

Related to ray-project#41604

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Andrew <orcahmlee@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: akshay-anyscale <122416226+akshay-anyscale@users.noreply.github.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

Fix warnings from Cython compiler.

<img width="1521" alt="Screenshot 2025-02-27 at 12 54 17 AM"
src="https://github.com/user-attachments/assets/865f1fe4-4fdc-4818-b00e-60513c138a50"
/>


## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: kaihsun <kaihsun@anyscale.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
…0962)

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Huaiwei Sun <scottsun94@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
## Why are these changes needed?

Should be used together with
ray-project/pygloo#34 so that the parameters can
be properly passed to gloo.

<!-- Please give a short summary of the change and the problem this
solves. -->

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [X] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [X] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [X] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [X] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Hollow Man <hollowman@opensuse.org>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025
Remove a duplicated sentence

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: maofagui <maofg92@163.com>
Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
ArturNiederfahrenhorst pushed a commit that referenced this issue Mar 4, 2025
## Why are these changes needed?
The current parameter-counting method uses multiple loops to count the
number of parameters, and additionally relies on `np.prod` and `filter`.
The method proposed in this PR instead uses native torch C++ code to
count parameters in a single loop without filtering.
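
A minimal sketch of single-loop counting, assuming the count is split into trainable and non-trainable parameters (the PR text does not state the return shape, so that split is an assumption). `numel()` and `requires_grad` are real attributes of torch parameters; `count_parameters` and `StubParam` are hypothetical names, and the stub exists only so the example runs without torch installed.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol, Tuple


class HasNumel(Protocol):
    """Anything shaped like a torch parameter for counting purposes."""

    requires_grad: bool

    def numel(self) -> int: ...


def count_parameters(params: Iterable[HasNumel]) -> Tuple[int, int]:
    """Return (trainable, non_trainable) element counts in one pass."""
    trainable = 0
    non_trainable = 0
    for p in params:
        n = p.numel()  # element count, no np.prod over the shape
        if p.requires_grad:
            trainable += n
        else:
            non_trainable += n
    return trainable, non_trainable


@dataclass
class StubParam:
    """Stand-in for a torch parameter, used only for this demonstration."""

    size: int
    requires_grad: bool = True

    def numel(self) -> int:
        return self.size


counts = count_parameters(
    [StubParam(6), StubParam(4, requires_grad=False), StubParam(10)]
)
print(counts)
```

With torch available, the same function should accept `model.parameters()` from any `torch.nn.Module`.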

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>