How to disable preprocessing for a policy? #8600
Comments
Answer: you can make the state space for the hard-coded policy a gym.spaces.Box regardless of whether the state space is actually continuous or not; RLlib does not perform any preprocessing for Box spaces. This is a pretty hacky solution though; it would be nice to see a cleaner one.
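A rough sketch of that workaround (not taken from the issue; the class name, policy logic, and space definitions are placeholders for illustration). The idea is that declaring the hard-coded policy's observation space as a Box makes RLlib fall back to its no-op preprocessor, so observations reach the policy unmodified:

```python
import gym
import numpy as np
from ray.rllib.policy.policy import Policy


class HardCodedPolicy(Policy):
    """Illustrative heuristic policy; name and logic are placeholders."""

    def compute_actions(self, obs_batch, state_batches=None, **kwargs):
        # Because the declared obs space is a Box, obs_batch holds the raw
        # values (e.g. 0, 1, 2) rather than one-hot vectors.
        flat = np.asarray(obs_batch, dtype=np.float32).reshape(len(obs_batch), -1)
        actions = [int(row[0]) % 3 for row in flat]
        return actions, [], {}

    def learn_on_batch(self, samples):
        return {}  # nothing to learn for a hard-coded policy

    def get_weights(self):
        return {}

    def set_weights(self, weights):
        pass


# Declare a Box even though the underlying observation is discrete (0, 1, 2);
# RLlib uses its NoPreprocessor for Box spaces, so values pass through as-is.
obs_space = gym.spaces.Box(low=0.0, high=2.0, shape=(1,), dtype=np.float32)
act_space = gym.spaces.Discrete(3)
```

In a multi-agent setup, this (obs_space, act_space) pair would then be supplied only for the hard-coded policy in the policies mapping, leaving the learned policies with their original spaces and preprocessing.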
Got it. Yes, setting preprocessing to off shouldn't even do one-hot encoding anymore. I'll take a look.
Has this been addressed in a non-hacky way? Maybe Line 1003 in 7916500?
NOPE... Also tried
Hmm... it looks like this is being worked on, but not quite ready for prime time:
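For anyone landing here later: newer RLlib releases expose an experimental config flag that skips the preprocessor API entirely. Assuming a version that supports it, the usage looks roughly like this (the environment name is just a placeholder):

```python
# Assumes an RLlib release that includes the experimental flag; check your
# installed version before relying on it.
config = {
    "env": "CartPole-v0",               # placeholder environment
    "_disable_preprocessor_api": True,  # pass raw observations through to the policy
}
```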
Just in case anyone finds it useful, here's a snippet that transforms boolean vector observations from a Tuple space into a flat int8 Box:

```python
import gym
import numpy as np
from ray import rllib


class BooleanVectorPreprocessor(rllib.models.preprocessors.Preprocessor):
    def _init_shape(self, observation_space, options=None):
        # One flat entry per component of the Tuple observation space.
        return (len(observation_space.spaces),)

    def transform(self, observation):
        # Flatten the tuple of booleans into a single vector.
        return np.array(observation)

    @property
    def observation_space(self):
        space = gym.spaces.Box(0, 1, self.shape, dtype='int8')
        space.original_space = self._obs_space
        return space


rllib.models.ModelCatalog.register_custom_preprocessor('boolean_vector', BooleanVectorPreprocessor)
```
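Once registered, the preprocessor can be referenced by name through the model config. A hedged sketch of the wiring (the trainer choice and environment name are placeholders, not part of the snippet above):

```python
from ray.rllib.agents.ppo import PPOTrainer  # any trainer would work here

trainer = PPOTrainer(config={
    "env": "MyBooleanVectorEnv",  # placeholder: an env emitting boolean-vector observations
    "model": {
        "custom_preprocessor": "boolean_vector",  # the name registered above
    },
})
```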
Thanks for adding to this, @andras-kth!
What is your question?
I have a few learned policies as well as one hard-coded policy. I instantiated the hard-coded policy using the Policy interface. I'd like to know how to disable the observation preprocessing for this policy only; I couldn't find anything about this in the docs.
For instance, in the rock-paper-scissors example, the observations are automatically converted to one-hot encodings. https://github.com/ray-project/ray/blob/master/rllib/examples/policy/rock_paper_scissors_dummies.py
But I'd like to be able to operate directly on the states 0, 1, 2 for rock, paper, and scissors.
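To make the behavior concrete, here is a small check (not from the linked example) of what RLlib's default preprocessor does to a Discrete(3) observation:

```python
import gym
from ray.rllib.models.preprocessors import get_preprocessor

obs_space = gym.spaces.Discrete(3)  # 0 = rock, 1 = paper, 2 = scissors
prep = get_preprocessor(obs_space)(obs_space)

# The raw state 2 ("scissors") arrives at the policy as a one-hot vector.
print(prep.transform(2))  # -> [0., 0., 1.]
```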