Uniform random agent not moving in AntGather environment #3

nikhilrayaprolu · 2021-08-27T13:25:09Z

I tried the example code provided in the README. The agent does not move from the position where it was dropped; it only jitters its body a little.

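import hrl_pybullet_envs  # importing registers the environments with gym
import gym
import numpy as np

env = gym.make('AntGatherBulletEnv-v0')
env.render()  # called before reset, as in the README example
ob = env.reset()
tot_rew = 0

for i in range(1000):
  # Take a uniform random action in [-1, 1] for every action dimension
  ob, rew, done, _ = env.step(np.random.uniform(-1, 1, env.action_space.shape))
  tot_rew += rew

  # Stop as soon as the episode ends
  if done:
    break

print(f'Achieved total reward of: {tot_rew}')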


sash-a · 2021-08-27T14:19:04Z

That's expected. It performs a random action at each time step, so the actions average out and it will likely stay in roughly the same spot.
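A minimal sketch of why this happens, using only NumPy and no simulator: actions drawn uniformly from [-1, 1] have a mean close to zero, whereas a constant action does not cancel out. The helper `mean_action` and the 8-dimensional action shape are assumptions made for illustration; they are not part of hrl_pybullet_envs, and the real Ant dynamics are far more complex than an average of actions.

```python
import numpy as np


def mean_action(sample_action, steps=1000):
    """Average the actions returned by sample_action over `steps` calls."""
    actions = np.stack([sample_action() for _ in range(steps)])
    return actions.mean(axis=0)


rng = np.random.default_rng(0)
action_shape = (8,)  # assumption: one value per joint of an 8-joint Ant

# Zero-mean random actions: pushes in opposite directions mostly cancel.
random_mean = mean_action(lambda: rng.uniform(-1, 1, action_shape))

# Constant action: the same push at every step, so nothing cancels.
constant_mean = mean_action(lambda: np.ones(action_shape))

print("mean random action:  ", np.round(random_mean, 3))
print("mean constant action:", constant_mean)
```

The random mean comes out near zero in every dimension, which matches the jittering in place described above, while the constant action keeps its full magnitude.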

nikhilrayaprolu · 2021-08-28T05:03:02Z

But even in the case of PointGatherEnv, the red cube does not move when executing random actions.

sash-a · 2021-08-28T09:14:22Z

Change

env.step(np.random.uniform(-1, 1, env.action_space.shape))

to

env.step(np.ones(env.action_space.shape))

nikhilrayaprolu · 2021-08-29T12:10:09Z

Thanks, @sash-a. Can you also provide PointMazeEnv?

nikhilrayaprolu · 2021-08-29T12:11:07Z

Also, an example with a trained agent in the provided Colab would help. Running a Stable Baselines agent would probably be enough:
https://stable-baselines3.readthedocs.io/en/master/

sash-a · 2021-08-29T12:26:51Z

PointMazeEnv may come in the future, but unfortunately I have pressing deadlines at the moment and it is not at the top of my list of envs to implement. AntPush and AntFall will likely come first.

The point of these environments is that they generally require hierarchical reinforcement learning to solve, so Stable Baselines likely would not cut it. Regardless, that is beyond the scope of this repository. At the moment it is simply for my own research, as I could not find non-MuJoCo versions of these envs. Anyone who wants to use these envs is welcome to, but I don't really have enough time to create baselines. You are more than welcome to try running some standard RL algorithms on these envs, see if they work, and contribute something like a benchmark.md; it would be much appreciated.

As a side note, I did at one point try PPO on AntGather and it got a reward of 0, but that was a couple of months ago and a very quick test.
