Supporting fixed number of training/testing episodes? #5071

Closed
peterhcyuen opened this issue Jun 30, 2019 · 3 comments
Labels
question Just a question :) stale The issue is stale. It will be closed within 7 days unless there are further conversation

Comments

@peterhcyuen

Does this framework support a fixed number of training/testing episodes? I added a stop criterion when calling tune.run(), for example stop={"episodes_total": 100}, but the final result showed that it ran for more than 100 episodes.

@ericl ericl added the question Just a question :) label Jul 1, 2019
@ericl
Contributor

ericl commented Jul 1, 2019

It will stop once the number of episodes exceeds that threshold. There is no way to do an exact stop, but the approximate value should be good enough.
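
The overshoot can be illustrated with a small simulation. This is only a sketch, not Tune's implementation: it assumes the stop criterion is checked once per training iteration and that each iteration finishes a whole batch of episodes. The function name `run_until` and the value of 32 episodes per iteration are hypothetical.

```python
def run_until(stop_episodes, episodes_per_iteration):
    """Simulate a trainer whose stop criterion is checked once per iteration.

    Assumption: the run stops as soon as the running episode count has
    reached the threshold, but the count only changes in whole-iteration steps.
    """
    episodes_total = 0
    iterations = 0
    while episodes_total < stop_episodes:
        # One training iteration completes several episodes at once.
        episodes_total += episodes_per_iteration
        iterations += 1
    return iterations, episodes_total


iterations, total = run_until(stop_episodes=100, episodes_per_iteration=32)
print(iterations, total)  # 4 128
```

Under these assumptions the run ends at 128 episodes rather than 100, because the threshold is crossed partway through the fourth iteration and the check only happens after that iteration completes.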

@stale

stale bot commented Nov 15, 2020

Hi, I'm a bot from the Ray team :)

To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Nov 15, 2020
@stale

stale bot commented Nov 29, 2020

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

@stale stale bot closed this as completed Nov 29, 2020
zcin pushed a commit that referenced this issue Mar 3, 2025
## Why are these changes needed?

Make deployment retry count configurable through environment variable

## Related issue number

This PR addresses #5071 

Since I did not find any references to this behavior in the public docs,
I decided not to update any `docs`; let me know if that's not true.

## Testing strategy

### Updated unit tests

### Manual test

1. Create a simple application:
```
import logging

from fastapi import FastAPI
from ray import serve

fastapi = FastAPI()
logger = logging.getLogger("ray.serve")

@serve.deployment(name="fastapi-deployment", num_replicas=2)
@serve.ingress(fastapi)
class FastAPIDeployment:
    def __init__(self):
        self.counter = 0
        # Fail on purpose so that every replica start attempt errors out
        # and the deployment retry logic is exercised.
        raise Exception("test")

    # FastAPI automatically parses the HTTP request.
    @fastapi.get("/hello")
    def say_hello(self, name: str) -> str:
        logger.info("Handling request!")
        return f"Hello {name}!"

my_app = FastAPIDeployment.bind()
```

2. Run the application from the local CLI:
```
MAX_PER_REPLICA_RETRY_MULTIPLIER=1 serve run test:my_app
```

3. From the logs I can see that we retry only once per replica instead of the
default `3` times:
https://gist.github.com/abrarsheikh/e85e00bb94ba443f76f77220b6ace530

Since my app contains 2 replicas, the code retries 2 * 1 times, as
expected.

4. Running without overriding the env variable (`serve run test:my_app`)
retries 6 times.
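
The arithmetic in steps 3 and 4 can be sketched as follows. This is an illustration only, not the code from the commit: the helper `max_deploy_retries` and the constant `DEFAULT_MULTIPLIER` are hypothetical names, and the sketch assumes the retry budget is simply the replica count times the multiplier read from `MAX_PER_REPLICA_RETRY_MULTIPLIER`.

```python
import os

# Default multiplier as stated in the commit message above.
DEFAULT_MULTIPLIER = 3


def max_deploy_retries(num_replicas, env=None):
    """Return the assumed retry budget: num_replicas * per-replica multiplier.

    `env` is a mapping of environment variables; it defaults to os.environ
    and is a parameter only so the sketch can be exercised without touching
    the real environment.
    """
    env = os.environ if env is None else env
    multiplier = int(env.get("MAX_PER_REPLICA_RETRY_MULTIPLIER", DEFAULT_MULTIPLIER))
    return num_replicas * multiplier


# Step 3: two replicas with the multiplier overridden to 1.
print(max_deploy_retries(2, {"MAX_PER_REPLICA_RETRY_MULTIPLIER": "1"}))  # 2
# Step 4: two replicas with the default multiplier.
print(max_deploy_retries(2, {}))  # 6
```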

---------

Signed-off-by: Abrar Sheikh <abrar2002as@gmail.com>
Signed-off-by: Abrar Sheikh <abrar@abrar-FK79L5J97K.local>
Co-authored-by: Saihajpreet Singh <c-saihajpreet.singh@anyscale.com>
Co-authored-by: Abrar Sheikh <abrar@abrar-FK79L5J97K.local>
crypdick pushed a commit that referenced this issue Mar 4, 2025
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this issue Mar 4, 2025