[RLlib] POC: Run RLlib w/o Preprocessors setup. #17656
Conversation
…le_batch_supports_complex_spaces
…eprocessors_soft # Conflicts: # rllib/policy/sample_batch.py
…le_batch_supports_complex_spaces
…le_batch_supports_complex_spaces # Conflicts: # rllib/policy/sample_batch.py
…le_batch_supports_complex_spaces
…le_batch_supports_complex_spaces
…le_batch_supports_complex_spaces
…eprocessors_soft # Conflicts: # rllib/agents/trainer.py # rllib/utils/annotations.py
…ecate_preprocessors_soft # Conflicts: # rllib/agents/trainer.py
…ecate_preprocessors_soft # Conflicts: # rllib/evaluation/collectors/simple_list_collector.py
…ecate_preprocessors_soft
…ecate_preprocessors_soft # Conflicts: # rllib/evaluation/collectors/simple_list_collector.py
rllib/execution/replay_buffer.py
Outdated
@@ -402,7 +402,7 @@ def add_batch(self, batch: SampleBatchType) -> None:
     # If SampleBatch has prio-replay weights, average
     # over these to use as a weight for the entire
     # sequence.
-    if "weights" in time_slice:
+    if "weights" in time_slice and time_slice["weights"]:
Avoid np.mean over an empty list (yields NaN).
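A minimal sketch of the failure mode this guard avoids; the dict stand-in and the None fallback are assumptions for illustration, not the PR's actual code:

import numpy as np

# time_slice stands in for a SampleBatch slice. Without the extra
# truthiness check, np.mean([]) emits a RuntimeWarning and returns NaN,
# which would then be used as the priority weight for the sequence.
time_slice = {"weights": []}

if "weights" in time_slice and time_slice["weights"]:
    weight = np.mean(time_slice["weights"])
else:
    weight = None  # assumption: fall back to the buffer's default priority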
…ecate_preprocessors_soft
…ecate_preprocessors_soft
@@ -416,6 +416,8 @@ def postprocess_nstep_and_prio(policy: Policy,
     batch[SampleBatch.REWARDS], batch[SampleBatch.NEXT_OBS],
     batch[SampleBatch.DONES])

+    # Create dummy prio-weights (1.0) in case we don't have any in
Probably should be in a different PR.
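For context, a hedged sketch of what "dummy prio-weights (1.0)" could look like here; the helper name, key strings, and the np.ones_like choice are assumptions, not necessarily the PR's exact code:

import numpy as np

def ensure_prio_weights(batch: dict) -> dict:
    # Hypothetical helper: if the incoming batch carries no prioritized-replay
    # weights yet, fill in neutral weights of 1.0 (one per timestep) so that
    # downstream prioritized-replay code always finds a "weights" column.
    if "weights" not in batch:
        batch["weights"] = np.ones_like(batch["rewards"], dtype=np.float32)
    return batch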
# env's action space before sending actions back to the env.
# (0.0 centered with small stddev; only affecting Box components).
# We will unsquash actions (and clip, just in case) to the bounds of
# the env's action space before sending actions back to the env.
"normalize_actions": True,
# If True, RLlib will clip actions according to the env's bounds
# before sending them back to the env.
# TODO: (sven) This option should be obsoleted and always be False.
"clip_actions": False,
# Whether to use "rllib" or "deepmind" preprocessors by default
Maybe describe in the comment what "rllib" or "deepmind" does.
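For reference, a hedged sketch of what such a comment might say; the wording is a suggestion, not the PR's actual text:

config = {
    # "deepmind": wrap (Atari) envs with DeepMind-style preprocessing
    #             (grayscale, downscaling, frame-stacking).
    # "rllib":    use RLlib's built-in generic preprocessors instead.
    # None (new in this PR): disable preprocessing entirely; observations
    #             reach the model exactly as the env returns them.
    "preprocessor_pref": "deepmind",
}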
self.buffers[SampleBatch.AGENT_INDEX][0].append(agent_index)
self.buffers[SampleBatch.ENV_ID][0].append(env_id)
self.buffers[SampleBatch.T][0].append(t)
self.buffers[SampleBatch.EPS_ID][0].append(self.episode_id)
If self.episode_id and unroll_id are constant, why repeatedly append the same data?
if SampleBatch.EPS_ID in values:
    assert values[SampleBatch.EPS_ID] == self.episode_id
    del values[SampleBatch.EPS_ID]
self.buffers[SampleBatch.EPS_ID][0].append(self.episode_id)
Same issue as above.
    shape=shape,
    name=".".join([str(p) for p in path]),
)

What does this do?
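A hedged reading of the name=".".join(...) pattern above: each leaf of a flattened complex space appears to get a placeholder named by its dotted path, so components can be told apart unambiguously. A minimal sketch of that naming scheme (the nested structure below is invented for illustration):

import numpy as np
import tree  # dm-tree, an RLlib dependency

nested_obs = {"sensors": {"cam": np.zeros(3)}, "task_id": 4}
for path, leaf in tree.flatten_with_path(nested_obs):
    # Paths such as ("sensors", "cam") become names like "sensors.cam".
    print(".".join(str(p) for p in path))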
This PR prepares for soon allowing individual observation components to be addressed by the trajectory view API, e.g. to enable frame-stacking for individual components within a complex observation space (Tuple|Dict). Soft-deprecating RLlib's Preprocessor API should also increase transparency for users and allow batched, model-based preprocessing of observations. Observations will arrive at the model exactly as they are returned by the env.
This PR is a POC that works for tf and torch.
preprocessor_pref: None; set to None to disable Preprocessors altogether (not even using the NoPreprocessor class anymore).
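A sketch of how the new setting might be used; the trainer class, env, and framework choice are illustrative (API as of this PR's Ray version), not prescribed by the PR:

from ray.rllib.agents.ppo import PPOTrainer

config = {
    "env": "CartPole-v0",
    # Disable Preprocessors entirely: observations are passed to the model
    # exactly as the env returns them (no flattening, no Atari wrapping).
    "preprocessor_pref": None,
    "framework": "tf",  # the POC is stated to work for tf and torch
}
trainer = PPOTrainer(config=config)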
Why are these changes needed?
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.