[RLlib] POC: Run RLlib w/o Preprocessors setup. #17656

Closed
68 commits
8841ad8
wip
sven1977 Jul 31, 2021
122bcbc
wip
sven1977 Aug 1, 2021
c9f7769
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 1, 2021
bbc806b
wip
sven1977 Aug 1, 2021
54394cb
Merge branch 'sample_batch_supports_complex_spaces' into deprecate_pr…
sven1977 Aug 1, 2021
7b9c86e
wip
sven1977 Aug 2, 2021
8b50495
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 2, 2021
1b52cd8
wip.
sven1977 Aug 2, 2021
674fb23
wip.
sven1977 Aug 2, 2021
88c8e95
fix.
sven1977 Aug 3, 2021
59646fd
Merge branch 'master' into deprecate_preprocessors_soft
sven1977 Aug 3, 2021
88408f9
Merge branch 'sample_batch_supports_complex_spaces' into deprecate_pr…
sven1977 Aug 3, 2021
a4b6458
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 3, 2021
da1fef7
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 4, 2021
68e81fe
wip.
sven1977 Aug 4, 2021
dd858c6
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 6, 2021
90b3735
wip.
sven1977 Aug 6, 2021
1cc1c87
wip.
sven1977 Aug 6, 2021
941c0e0
Merge branch 'master' of https://github.com/ray-project/ray into samp…
sven1977 Aug 6, 2021
4ebcdad
wip.
sven1977 Aug 6, 2021
40bb3d3
wip.
sven1977 Aug 6, 2021
8aa3015
wip.
sven1977 Aug 6, 2021
5e1a769
Merge branch 'sample_batch_supports_complex_spaces' into deprecate_pr…
sven1977 Aug 6, 2021
9f787dd
wip.
sven1977 Aug 7, 2021
3d719d0
wip and LINT.
sven1977 Aug 12, 2021
a40667c
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 12, 2021
97182ab
fix.
sven1977 Aug 13, 2021
c664afa
fix.
sven1977 Aug 13, 2021
e810da5
fixes.
sven1977 Aug 14, 2021
f62abc1
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 14, 2021
2980346
fixes.
sven1977 Aug 15, 2021
1796934
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 15, 2021
481ed04
wip.
sven1977 Aug 16, 2021
a72a7c0
wip.
sven1977 Aug 18, 2021
7a4c678
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 19, 2021
5de2cae
wip.
sven1977 Aug 19, 2021
a2d4069
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 19, 2021
4e88450
wip.
sven1977 Aug 19, 2021
d911780
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 20, 2021
f44a6c3
fixes
sven1977 Aug 20, 2021
3d7c37d
fixes
sven1977 Aug 20, 2021
d9f0af9
fixes.
sven1977 Aug 20, 2021
4e8da9e
fix.
sven1977 Aug 20, 2021
59fa8a4
Add "env_id" and "t" to SampleBatch as consts.
sven1977 Aug 20, 2021
cce8ccb
Merge branch 'master' of https://github.com/ray-project/ray into seq_…
sven1977 Aug 20, 2021
78e1472
Fix.
sven1977 Aug 20, 2021
2253d58
Merge branch 'seq_lens_as_sample_batch_constant' into deprecate_prepr…
sven1977 Aug 20, 2021
8fcb9cf
wip.
sven1977 Aug 20, 2021
392dd1e
merge
sven1977 Aug 20, 2021
c3d9d5a
LINT.
sven1977 Aug 20, 2021
2a80b9d
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 20, 2021
a4e9744
LINT.
sven1977 Aug 20, 2021
7959e64
wip.
sven1977 Aug 20, 2021
cf3c9dc
wip.
sven1977 Aug 20, 2021
9a3cfb8
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 21, 2021
90083b6
fix.
sven1977 Aug 21, 2021
405e0ce
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 21, 2021
4825e27
wip.
sven1977 Aug 21, 2021
e928f73
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 23, 2021
6c0ad15
wip.
sven1977 Aug 23, 2021
3b4f644
wip.
sven1977 Aug 23, 2021
956fc88
Merge branch 'master' of https://github.com/ray-project/ray into depr…
sven1977 Aug 27, 2021
60aa649
wip
sven1977 Aug 27, 2021
6d822db
wip
sven1977 Aug 27, 2021
29695e0
wip
sven1977 Aug 27, 2021
fb50d81
wip
sven1977 Aug 27, 2021
fa8f54f
wip
sven1977 Aug 27, 2021
d294729
wip
sven1977 Aug 27, 2021
18 changes: 18 additions & 0 deletions rllib/BUILD
@@ -2354,6 +2354,24 @@ py_test(
srcs = ["examples/pettingzoo_env.py"],
)

+py_test(
+name = "examples/preprocessing_disabled_tf",
+main = "examples/preprocessing_disabled.py",
+tags = ["team:ml", "examples", "examples_P"],
+size = "medium",
+srcs = ["examples/preprocessing_disabled.py"],
+args = ["--stop-iters=2"]
+)
+
+py_test(
+name = "examples/preprocessing_disabled_torch",
+main = "examples/preprocessing_disabled.py",
+tags = ["team:ml", "examples", "examples_P"],
+size = "medium",
+srcs = ["examples/preprocessing_disabled.py"],
+args = ["--framework=torch", "--stop-iters=2"]
+)

py_test(
name = "examples/remote_envs_with_inference_done_on_main_node_tf",
main = "examples/remote_envs_with_inference_done_on_main_node.py",
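Note on the two new test targets: both run the same script and differ only in their `args`. A minimal sketch of how a script could accept these two flags is given here. The parser below is hypothetical: the flag names come from the `args` lists in the hunk, but the default values (`"tf"` and `200`) are assumptions and are not taken from `examples/preprocessing_disabled.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Builds a parser for the two flags used by the test targets above."""
    parser = argparse.ArgumentParser()
    # Assumed default: the "_tf" target passes no --framework flag at all.
    parser.add_argument("--framework", type=str, default="tf")
    # Assumed default; the tests override it with a small value to stay fast.
    parser.add_argument("--stop-iters", type=int, default=200)
    return parser


# Same argument lists as in the two py_test rules.
tf_args = build_parser().parse_args(["--stop-iters=2"])
torch_args = build_parser().parse_args(["--framework=torch", "--stop-iters=2"])
```

With this sketch, `tf_args.framework` falls back to the assumed default, while `torch_args.framework` is `"torch"`; argparse exposes `--stop-iters` as the attribute `stop_iters`.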
2 changes: 2 additions & 0 deletions rllib/agents/dqn/dqn_tf_policy.py
@@ -416,6 +416,8 @@ def postprocess_nstep_and_prio(policy: Policy,
batch[SampleBatch.REWARDS], batch[SampleBatch.NEXT_OBS],
batch[SampleBatch.DONES])

# Create dummy prio-weights (1.0) in case we don't have any in

Review comment (Contributor): This should probably be in a different PR.

# the batch.
if PRIO_WEIGHTS not in batch:
batch[PRIO_WEIGHTS] = np.ones_like(batch[SampleBatch.REWARDS])
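
Note on the hunk above: the fallback gives every timestep a priority weight of 1.0 when the batch carries no weights. A self-contained sketch of the same idea, using a plain dict in place of `SampleBatch`, follows. The key strings `"weights"` and `"rewards"` are stand-ins chosen for illustration and are assumptions here.

```python
import numpy as np

# Stand-in key names (assumptions); the diff uses the PRIO_WEIGHTS and
# SampleBatch.REWARDS constants instead.
PRIO_WEIGHTS = "weights"
REWARDS = "rewards"


def ensure_prio_weights(batch: dict) -> dict:
    """Adds uniform priority weights (1.0 per timestep) if none are present."""
    if PRIO_WEIGHTS not in batch:
        # ones_like keeps the shape and dtype of the rewards array.
        batch[PRIO_WEIGHTS] = np.ones_like(batch[REWARDS])
    return batch


batch = ensure_prio_weights(
    {REWARDS: np.array([0.5, -1.0, 2.0], dtype=np.float32)})
```

A batch that already contains weights is returned unchanged, so the function is safe to call on both prioritized and non-prioritized data.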

16 changes: 12 additions & 4 deletions rllib/agents/trainer.py
@@ -185,15 +185,17 @@
# Tuple[value1, value2]: Clip at value1 and value2.
"clip_rewards": None,
# If True, RLlib will learn entirely inside a normalized action space
-# (0.0 centered with small stddev; only affecting Box components) and
-# only unsquash actions (and clip just in case) to the bounds of
-# env's action space before sending actions back to the env.
+# (0.0 centered with small stddev; only affecting Box components).
+# We will unsquash actions (and clip, just in case) to the bounds of
+# the env's action space before sending actions back to the env.
"normalize_actions": True,
# If True, RLlib will clip actions according to the env's bounds
# before sending them back to the env.
# TODO: (sven) This option should be obsoleted and always be False.
"clip_actions": False,
# Whether to use "rllib" or "deepmind" preprocessors by default

Review comment (Contributor): Maybe describe what rllib or deepmind does in the comment.

+# Set to None for using no preprocessor. In this case, the model will have
+# to handle possibly complex observations from the environment.
"preprocessor_pref": "deepmind",

# === Debug Settings ===
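
Note on `normalize_actions`: the comment in this hunk describes unsquashing a normalized action back to the env's bounds and then clipping it. A minimal sketch of that mapping for a single Box-like component follows. It assumes the normalized range is [-1, 1]; the hunk itself only says "0.0 centered with small stddev", so the exact range and the function name `unsquash_action` are assumptions of this sketch.

```python
import numpy as np


def unsquash_action(action, low, high):
    """Maps a normalized action from [-1, 1] to [low, high], then clips.

    The [-1, 1] input range is an assumption of this sketch.
    """
    action = np.asarray(action, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    # Linear map: -1 -> low, 0 -> midpoint, +1 -> high.
    scaled = low + (action + 1.0) * (high - low) / 2.0
    # Clip "just in case" the policy produced a value outside [-1, 1].
    return np.clip(scaled, low, high)


env_action = unsquash_action([0.0, 3.0], low=[-2.0, 0.0], high=[2.0, 10.0])
```

The second component shows why the clip matters: a normalized value of 3.0 would map to 20.0, which lies outside the bounds, so it is clipped to the upper bound.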
@@ -1001,7 +1003,7 @@ def compute_single_action(

# Check the preprocessor and preprocess, if necessary.
pp = local_worker.preprocessors[policy_id]
-if type(pp).__name__ != "NoPreprocessor":
+if pp and type(pp).__name__ != "NoPreprocessor":
observation = pp.transform(observation)
filtered_observation = local_worker.filters[policy_id](
observation, update=False)
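
Note on the changed guard: with preprocessing disabled, the per-policy preprocessor entry can be `None`, and the old check would then call `None.transform(...)`. A small sketch of the new guard in isolation follows. The two classes and the helper `maybe_preprocess` are hypothetical stand-ins written for this illustration.

```python
class NoPreprocessor:
    """Stand-in for a pass-through preprocessor (hypothetical)."""

    def transform(self, observation):
        return observation


class DoublingPreprocessor:
    """Stand-in for a real preprocessor (hypothetical)."""

    def transform(self, observation):
        return observation * 2


def maybe_preprocess(pp, observation):
    """Applies `pp` unless it is None (or falsy) or a NoPreprocessor."""
    # `pp and ...` short-circuits on None, so the type check and the
    # transform call are never reached without a preprocessor.
    if pp and type(pp).__name__ != "NoPreprocessor":
        observation = pp.transform(observation)
    return observation


raw = maybe_preprocess(None, 3)
passthrough = maybe_preprocess(NoPreprocessor(), 3)
doubled = maybe_preprocess(DoublingPreprocessor(), 3)
```

One design caveat of a truthiness check: an object that defines `__bool__` or `__len__` and evaluates as falsy would also be skipped, so `pp is not None` is the stricter alternative.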
@@ -1474,6 +1476,12 @@ def _validate_config(config: PartialTrainerConfigDict,
config["input_evaluation"]))

# Check model config.
+# If no preprocessing, propagate into model's config as well
+# (so model will know, whether inputs are preprocessed or not).
+if config["preprocessor_pref"] is None:
+model_config["_no_preprocessor"] = True
+
+# Prev_a/r settings.
prev_a_r = model_config.get("lstm_use_prev_action_reward",
DEPRECATED_VALUE)
if prev_a_r != DEPRECATED_VALUE:
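
Note on the propagation step: the added lines copy the trainer-level decision (`preprocessor_pref` is `None`) into the model config so the model can tell whether its inputs were preprocessed. A minimal sketch with plain dicts follows. It assumes that `model_config` refers to the nested `config["model"]` dict, which is not visible in this hunk, and the helper name is hypothetical.

```python
def propagate_no_preprocessor(config: dict) -> dict:
    """Sets the model-level flag when preprocessing is disabled."""
    # Assumption: the model config lives under the "model" key.
    model_config = config["model"]
    if config["preprocessor_pref"] is None:
        model_config["_no_preprocessor"] = True
    return config


disabled = propagate_no_preprocessor({"preprocessor_pref": None, "model": {}})
enabled = propagate_no_preprocessor(
    {"preprocessor_pref": "deepmind", "model": {}})
```

The flag is only ever set to `True`; with a preprocessor preference in place, the key is left out of the model config entirely.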
2 changes: 1 addition & 1 deletion rllib/env/multi_agent_env.py
@@ -142,7 +142,7 @@ def make_multi_agent(env_name_or_creator):
Returns:
Type[MultiAgentEnv]: New MultiAgentEnv class to be used as env.
The constructor takes a config dict with `num_agents` key
-(default=1). The reset of the config dict will be passed on to the
+(default=1). The rest of the config dict will be passed on to the
underlying single-agent env's constructor.
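
Note on the config handling described in this docstring: `num_agents` is consumed by the wrapper, and the remaining keys go to each single-agent env. The sketch below illustrates only that split. It is a simplified, hypothetical stand-in and not the `make_multi_agent` implementation; the toy env class and the `size` key are invented for the example.

```python
class ToyEnv:
    """Hypothetical single-agent env that just stores its config."""

    def __init__(self, config=None):
        self.config = config or {}

    def reset(self):
        return 0


def make_multi_agent_sketch(env_cls):
    """Returns a class that holds `num_agents` copies of `env_cls`."""

    class MultiEnv:
        def __init__(self, config=None):
            config = dict(config or {})
            # `num_agents` is consumed here (default=1) ...
            num_agents = config.pop("num_agents", 1)
            # ... and the rest is passed to each sub-env's constructor.
            self.agents = [env_cls(config) for _ in range(num_agents)]

        def reset(self):
            # One observation per agent, keyed by agent index.
            return {i: env.reset() for i, env in enumerate(self.agents)}

    return MultiEnv


multi_env = make_multi_agent_sketch(ToyEnv)({"num_agents": 2, "size": 5})
first_obs = multi_env.reset()
```

Copying the config before popping `num_agents` keeps the caller's dict unmodified.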

Examples: