[RLlib] RLTrainer is all you need. #31490
Conversation
multi-gpus tests pass now
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
rllib/core/rl_trainer/rl_trainer.py (outdated diff)
# rerun make_optimizers to update the params and optimizer
self.make_optimizers()

def make_module(self) -> RLModule:
This returns a MultiAgentRLModule, so the return type hint should be updated.
LGTM. Comments and type hints are being fixed.
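For context, a sketch of what the corrected hint could look like; the import path and the surrounding class stub are assumptions for illustration only, not the actual file contents:

```python
from ray.rllib.core.rl_module.marl_module import MultiAgentRLModule  # assumed import path


class RLTrainer:
    # Signature sketch only: the return hint reflects that a MultiAgentRLModule,
    # not a single-agent RLModule, is built and returned.
    def make_module(self) -> MultiAgentRLModule:
        ...
```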
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
@@ -5,19 +5,13 @@
import unittest

import ray
You can ignore everything that is under optim, since these tests are removed from CI anyway.
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Test failures do not seem relevant.
moved rl_optimizer logic into rl_trainer
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Signed-off-by: tmynn <hovhannes.tamoyan@gmail.com>
Why are these changes needed?
The RLOptimizer seems to have become a shallow module that just adds to the complexity of the system. It is basically responsible for two things: 1) defining the framework optimizers, and 2) containing the loss logic. Defining the optimizer inside this module creates a lot of unnecessary complexity when it comes to multi-agent RLOptimizers; by moving this logic into RLTrainer, we can get rid of these complexities.
Since compute_loss is also a stateless function, we can easily move it to RLTrainer as well. Now, all users have to do is extend RLTrainer directly to customize the training phase of their algorithm. For example, BCOptimizer now becomes part of BCRLTrainer's implementation, where all I have to do is optionally override `_configure_optimizer()` and write `compute_loss`. `compute_loss` will be written as a multi-agent loss; this is where the first-class treatment of MARL comes into play. It makes very complicated MARL communication patterns possible, and also extremely easy to express.
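As a rough illustration of the intended usage pattern, here is a minimal sketch of such a BC-style trainer. The attribute names, method signatures, optimizer choice, and batch layout below are assumptions made for illustration, not the final RLTrainer API:

```python
import torch

# Import path as in this PR (rllib/core/rl_trainer/rl_trainer.py); it may change later.
from ray.rllib.core.rl_trainer.rl_trainer import RLTrainer


class BCRLTrainer(RLTrainer):
    """Hypothetical behavior-cloning trainer: optimizer setup and loss logic
    live here directly, instead of in a separate BCOptimizer/RLOptimizer."""

    def _configure_optimizer(self):
        # Assumed: `self.module` is the MultiAgentRLModule, keyed by module_id.
        # One Adam optimizer per sub-module, returned as a dict for illustration.
        return {
            module_id: torch.optim.Adam(module.parameters(), lr=1e-3)
            for module_id, module in self.module.items()
        }

    def compute_loss(self, fwd_out, batch):
        # Written directly as a multi-agent loss: iterate over the per-module
        # forward outputs and combine them into a total loss.
        losses = {}
        for module_id, module_out in fwd_out.items():
            action_dist = module_out["action_dist"]  # assumed key name
            logp = action_dist.logp(batch[module_id]["actions"])
            losses[module_id] = -torch.mean(logp)
        losses["total_loss"] = sum(losses.values())
        return losses
```

Because the loss sees all modules' outputs at once, cross-agent terms (e.g., shared critics or communication penalties) can be added in one place, without any per-agent RLOptimizer indirection.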
Related issue number

Checks
- I've signed off every commit (by using git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.