Reacher Task and AutoResetWrapper #452
esraaelelimy started this conversation in General
Replies: 1 comment
-
Hi @esraaelelimy, indeed AutoResetWrapper will cache the first_state, but the first_state is sampled with a different
(embedded code reference: brax/brax/training/agents/ppo/train.py, lines 418 to 431 in f9a4d73)
But we have not done in-depth analysis on some of these hyperparameters (i.e. ...
-
I am looking to use the Brax Reacher task as an alternative to MuJoCo Reacher for some RL tasks, but I have some concerns:
In the MuJoCo Reacher task, if the fingertip reaches the target, a new random target appears. Also, at the beginning of each new episode, the target position changes. In Brax, I see that the target position is only generated when the environment is reset. Moreover, when using the AutoResetWrapper, each reset fetches the cached 'first state', which means that the random target is generated once at the very beginning and never changes. Does this make the Brax version of Reacher easier to solve than MuJoCo's Reacher? And how can we make the AutoResetWrapper actually change the target at every reset without sacrificing speed?
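To make the two behaviours concrete, here is a minimal toy sketch in plain JAX. It is not Brax's actual AutoResetWrapper; the state layout and the names `sample_target`, `reset`, `step_cached`, and `step_resample` are all hypothetical, and the target range is an arbitrary assumption. It contrasts restoring a cached first state with carrying an rng key in the state and drawing a fresh target whenever `done` is set, using `jnp.where` so that both variants stay jit-compatible.

```python
import jax
import jax.numpy as jnp


def sample_target(rng):
    # Hypothetical sampler: uniform in a small square (range is an assumption).
    return jax.random.uniform(rng, (2,), minval=-0.2, maxval=0.2)


def reset(rng):
    # Toy state: a target, a step counter, and an rng key carried along.
    rng, target_rng = jax.random.split(rng)
    return {"target": sample_target(target_rng),
            "t": jnp.zeros((), jnp.int32),
            "rng": rng}


def select(done, on_done, otherwise):
    # Branch-free choice between two states with the same structure.
    return jax.tree_util.tree_map(
        lambda a, b: jnp.where(done, a, b), on_done, otherwise)


def step_cached(state, first_state, done):
    # Cached auto-reset: when done, restore the stored first state,
    # so the target is the same after every reset.
    advanced = {**state, "t": state["t"] + 1}
    return select(done, first_state, advanced)


def step_resample(state, done):
    # Alternative: when done, use a fresh target drawn from the carried key.
    # The carried key only advances when a reset actually happens.
    rng, target_rng = jax.random.split(state["rng"])
    fresh = {"target": sample_target(target_rng),
             "t": jnp.zeros((), jnp.int32),
             "rng": rng}
    advanced = {**state, "t": state["t"] + 1}
    return select(done, fresh, advanced)


state = reset(jax.random.PRNGKey(0))
done = jnp.array(True)
cached = jax.jit(step_cached)(state, state, done)
resampled = jax.jit(step_resample)(state, done)
print("same target after cached reset:",
      bool(jnp.array_equal(cached["target"], state["target"])))
print("same target after resampled reset:",
      bool(jnp.array_equal(resampled["target"], state["target"])))
```

In this sketch the resampling variant only adds one key split and one small random draw per step, and both branches are computed and then selected, so there is no Python-level control flow to break `jit` or `vmap`. Whether the same trade-off holds inside a real Brax environment would need to be measured.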