Hyperparameters for biped walk #24
Comments
I have been training a bipedal robot as well. |
That's very interesting, thank you @WoohyunCha ! I think I may still have some bugs, even when I keep only the amp reward (no velocity tracking) and only the forward walk motion example, the robot has trouble learning something clean. There is shaking and it's not really a regular walking gait. If I am not mistaken, this parameter is not "parameterized" in AMP_for_hardware, so I dug into the code and I think I found it, so I parameterized it to play with it. I think it is this (in @WoohyunCha Did you tune this parameter as well or did you keep it at 10 ? In this repo https://github.com/inspirai/MetalHead they seem to use a very low Also, I record my motion examples at 60fps. My understanding is that |
Ok, I'll try introducing the Thank you for your valuable help @WoohyunCha ! |
Also, my colleague says including the b_loss from IsaacgymEnvs is important so maybe take a look into it as well. |
I'll look into the b_loss ! Yeah I'll report back there, it could be useful for other people too :) |
So I have been playing with This makes me think there is a bug somewhere, or that some of the parameters are much too far from what they should be. @WoohyunCha @Alescontrela any ideas ? |
I added clipping for the observations (5) and actions (1) like in IsaacGymEnvs, and it seems to have a drastic effect on learning behavior. I'm starting to see things I like :)
amp_for_hardware-2024-08-07_22.57.30.mp4
This is with only motion imitation ( |
Congrats @apirrone !! Also, about clipping the observations and actions, |
So I've been messing around all day, no luck. Now, I'm trying to match all the parameters I had in IsaacGymEnvs, but I don't know if it's such a good idea since the a1_amp example of this repo works just fine ... If anyone has any idea of things I could try, that would be great. I'm a little out of ideas 😅 |
That's a great result @WoohyunCha, congrats ! I have tried with pretty much the exact same parameters as in your screenshot, but it wouldn't imitate the motion properly. I generate the reference motion using a procedural walk engine, here is an example:
amp_for_hardware-2024-08-09_10.02.39.mp4
I know these motions are somewhat realistic, as I can make the robot walk in MuJoCo using them:
amp_for_hardware-2024-08-09_10.04.27.mp4
I also know I formatted them correctly for AMP_for_hardware's
amp_for_hardware-2024-08-06_15.55.33.mp4
For now, I am only trying to make it walk forward properly, so my velocity command is always 0.1 m/s (I looked at the average linear velocity in MuJoCo when walking forward). I matched the actual torque of the motors I'm using in the real robot (given by the specs), they are not very powerful. I was thinking maybe the robot is simply under-actuated and it's hard to follow this walk properly, but I was able to learn a very clean walk with these specs in IsaacGymEnvs, so I don't think it's that:
isaac_walk-2024-07-16_17.52.32.mp4
I'll keep trying here, but at some point I will probably go back to IsaacGymEnvs. |
@WoohyunCha can you share your reward scale? |
I used 50 for linear velocity tracking and 25 for angular velocity tracking. |
@apirrone in resample_commands function velocities smaller than 0.2 are set to zero, so I assume you are always feeding 0 command to the policy |
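For readers following along, the deadband described above can be sketched in a few lines. This is a pure-Python illustration and not the repository's code: `apply_command_deadband` and its `threshold` argument are hypothetical names, and the sketch assumes the threshold is applied to the norm of the planar velocity command.

```python
import math


def apply_command_deadband(vx, vy, threshold=0.2):
    """Zero a planar velocity command (m/s) whose norm is not above `threshold`."""
    speed = math.hypot(vx, vy)
    if speed > threshold:
        return (vx, vy)
    return (0.0, 0.0)


# A constant 0.1 m/s forward command falls inside a 0.2 m/s deadband,
# so the policy would only ever see a zero command.
print(apply_command_deadband(0.1, 0.0))
# With the threshold lowered to 0.01, the same command passes through.
print(apply_command_deadband(0.1, 0.0, threshold=0.01))
```

If the deadband works this way, any commanded speed at or below the threshold is indistinguishable from "stand still" during training.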
Hi @kaydx1 , yes I noticed that recently, I changed that to 0.01, but still no luck. Thank you for your help, don't hesitate if you have another idea :) |
@apirrone Check the _get_noise_scale_vec function as well, to see if it corresponds to your action space (if you are using noise). |
I removed all noise and randomization for now, trying to debug the root of the issue. But at first I had issues with base mass randomization, which would add or remove up to 1kg, and my robot is 1kg :) |
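To make the scale problem concrete: a uniform offset of up to ±1 kg on a 1 kg base lets the simulated mass land anywhere between 0 kg and 2 kg. One possible alternative, sketched below as an assumption and not as something the repository provides, is to randomize by a fraction of the nominal mass. The function names and the 10% fraction are hypothetical.

```python
import random


def randomized_mass_absolute(nominal_mass, max_offset, rng):
    """Nominal mass plus a uniform offset in [-max_offset, +max_offset] kg."""
    return nominal_mass + rng.uniform(-max_offset, max_offset)


def randomized_mass_relative(nominal_mass, fraction, rng):
    """Nominal mass scaled by a uniform factor in [1 - fraction, 1 + fraction]."""
    return nominal_mass * (1.0 + rng.uniform(-fraction, fraction))


rng = random.Random(0)
absolute = [randomized_mass_absolute(1.0, 1.0, rng) for _ in range(1000)]
relative = [randomized_mass_relative(1.0, 0.1, rng) for _ in range(1000)]
print(min(absolute), max(absolute))  # bounded by 0 kg and 2 kg
print(min(relative), max(relative))  # bounded by 0.9 kg and 1.1 kg
```

A relative range keeps the perturbation proportionate regardless of how light the robot is.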
@kaydx1 @WoohyunCha Do you think I could have simulation stability issues because of the low weight of my robot ? I set the max torque to the value given by the specs of the real motors I use (0.52 Nm), but I have trouble finding relevant stiffness and damping parameters. Also, I tried implementing the same custom PD control as in AMP_for_hardware in IsaacGymEnvs, and I have similar issues. With isaac gym's PD controller (using So I guess my issues could come from simulation stability (dt ? decimation ?) or control parameters ? |
@apirrone You could try manually investigating the outputs of both the position- and torque-controlled policies, and the torques you are sending in the compute_torques function as well. Maybe it will give some insights (if your torque always hits the limit, or if it is too small, for example). You can also try a bigger action scale (0.5 or 1), and maybe increase the clipping parameter for actions and observations. Regarding stability, if your sim.dt is small enough (0.005 for example), it shouldn't be an issue. |
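The torque check suggested above could look roughly like this for a single joint. It is a sketch under assumptions: the PD law shown is a generic position-target controller, not necessarily identical to the repository's compute_torques, and the gains and action scale are illustrative values. Only the 0.52 Nm limit comes from this thread.

```python
def pd_torque(action, default_pos, pos, vel, kp, kd, action_scale, torque_limit):
    """Return (clipped_torque, raw_torque) for one joint under a PD position target."""
    target = action * action_scale + default_pos
    raw = kp * (target - pos) - kd * vel
    clipped = max(-torque_limit, min(torque_limit, raw))
    return clipped, raw


def saturation_ratio(raw_torques, torque_limit):
    """Fraction of samples whose raw torque meets or exceeds the limit."""
    if not raw_torques:
        return 0.0
    saturated = sum(1 for t in raw_torques if abs(t) >= torque_limit)
    return saturated / len(raw_torques)


# Illustrative values: kp, kd and action_scale are assumptions.
clipped, raw = pd_torque(
    action=1.0, default_pos=0.0, pos=0.0, vel=0.0,
    kp=10.0, kd=0.1, action_scale=0.25, torque_limit=0.52,
)
print(clipped, raw)
print(saturation_ratio([0.1, 0.6, -0.7, 0.2], torque_limit=0.52))
```

Logging the raw torque next to the clipped one shows whether the policy is asking for more than the motors can deliver, which a plot of clipped torques alone would hide.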
Good idea, I'll try that |
Do you know how to get the computed torques out of the default PD controller ? I did this :
    self.gym.refresh_dof_force_tensor(self.sim)
    torques = self.gym.acquire_dof_force_tensor(self.sim)
    torques = gymtorch.wrap_tensor(torques).view(self.num_envs, self.num_dof)
But I think it returns the net torques applied on each joint, and in the case of a "working" walk, the torques are mostly compensated by gravity, which gives this Here, custom is with the custom PD controller that uses |
@apirrone No, I don't have experience with gym.set_dof_position_target_tensor(). So that means that you learn the right motor position commands, but when you compute the torque from these positions it doesn't work? Right now I have no ideas. |
My understanding is that when the robot is standing up, the policy learned to output actions such that the torques sent to the motors compensate the torques created by gravity (with the legs flexed), so that the total sum of torques is close to zero. I think that is what I don't know how I can get the torques that are applied to the motors by the policy when using isaac's pd control |
@apirrone I also forgot about default_dof_drive_mode in asset config? Did you check it? |
I set it to None, by default it was effort, does this make a difference ? |
@apirrone https://forums.developer.nvidia.com/t/difference-between-set-dof-actuation-force-tensor-and-set-actor-dof-position-targets/271432/4 |
@apirrone Very cool robot! Can't wait to see it walking properly. Are you entirely sure that the reference motion is correct? Not just the joint angles, but also the velocities / joint velocities. Also, the mapping between reference motion joints and robot joints is crucial. Here is how I would recommend debugging that:
If they don't line up, then there's a good chance something is wrong. @WoohyunCha That motion looks awesome! |
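The debugging steps listed in the comment above did not survive in this copy of the thread, so the following is not a reconstruction of them. It is one possible consistency check, offered as an assumption: compare the joint velocities stored in the reference motion against velocities obtained by finite-differencing the stored joint positions at the recording frame rate (60 fps was mentioned earlier in the thread). The function names are hypothetical.

```python
def finite_difference_velocities(positions, fps):
    """Forward-difference velocities (rad/s) from per-frame joint positions (rad)."""
    dt = 1.0 / fps
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


def max_velocity_mismatch(positions, stored_velocities, fps):
    """Largest absolute gap between stored and finite-difference velocities."""
    estimated = finite_difference_velocities(positions, fps)
    return max(abs(e - s) for e, s in zip(estimated, stored_velocities))


# One joint moving 0.1 rad per frame at 60 fps should be stored as about 6 rad/s.
positions = [0.0, 0.1, 0.2, 0.3]
print(max_velocity_mismatch(positions, [6.0, 6.0, 6.0], fps=60))
# Velocities stored per frame instead of per second show up as a large mismatch.
print(max_velocity_mismatch(positions, [0.1, 0.1, 0.1], fps=60))
```

A large mismatch would point at a unit or frame-rate error in the motion file, while a small one suggests the stored velocities agree with the positions.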
Hi @Alescontrela ! I'm pretty sure the reference motion is correct. I did not plot it, but I printed the difference between the data and the actual motion in Isaac in And the difference is 0. In my understanding, this means that the mapping is correct too, right ? A little update, this is where I'm at now :
bdx_amp_for_hardware-2024-08-17_10.44.04.mp4
I tuned the control parameters better, so there is way less shaking, which is good :) But as you can see, the robots are still not really walking ^^ Thank you for your help ! |
So I'm still investigating.
bdx_amp_for_hardware-2024-08-19_17.41.50.mp4
Looking at the dof position targets (motion reference) and actual dof positions: Looks pretty good. Now, the dof velocities (in rad/s): This looks very wrong, right ? For reference, this is how I get these values :
    target_dof_vel = self.amp_loader.get_joint_vel_batch(
        self.amp_loader.get_full_frame_at_time_batch(
            np.zeros(self.num_envs, dtype=int),  # builtin int; the np.int alias was removed in NumPy 1.24
            self.envs_times.cpu().numpy().flatten(),
        )
    )
    # Round to 4 decimals and move to NumPy for plotting.
    target_dof_vel_np = torch.round(target_dof_vel, decimals=4).cpu().numpy()
    actual_dof_vel_np = torch.round(self.dof_vel, decimals=4).cpu().numpy()
Also, I noticed a while back that the velocities shown in the graph when using For a run that looks like this
bdx_amp_for_hardware-2024-08-19_17.45.47.mp4
Does anyone know what could be going on ? |
Is the video below from using the custom PD controller instead of the one in Isaacgym? If so, could it not be simply a matter of gain tuning? |
@WoohyunCha Yeah, this is a random run after a few training steps, it was just to demonstrate the velocity noise. It is using the custom PD controller. Maybe it's just a matter of gain tuning, but I spent some time tuning those gains. In the first video, the motion is replayed through the actions, meaning through the PD controller. The motion is correctly reproduced and, as you can see in the position graphs, the dofs follow the commands pretty well. So I don't think the gains are too far off, right ? |
So both the videos use custom PD controllers, and the only difference is whether the base is fixed or not? If so, I think it is because the gains will work differently when the robot is under contact (with the ground). Have you tuned the gains while the robot base is fixed? |
Yes I have tuned the motor gains with the base fixed, to see if it could reproduce the reference motion. I could try tuning the gains with the open loop motion with the base not fixed, but it's hard because the robot immediately falls down with the open loop motion. I'll see if I can make it work |
I have spent some time tuning the kP and kD parameters, and I had to reduce the dt to 0.002 (instead of 0.005) to get something stable. With zero actions, the robot stands on its own without shaking, and when replaying the motion it seems to follow the commands pretty well (it's falling because it's open loop, but it looks ok). With
bdx_amp_for_hardware-2024-08-21_09.38.41.mp4
I think that, because of the low weight and inertia of my robot, the physics solver introduced a lot of noise with a higher dt, and because the kP and kD parameters were not optimal, the policy could not learn to properly follow the movements. Does this make sense ? |
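The intuition above can be checked on a toy model. The sketch below is an illustration only and not the Isaac Gym solver: it integrates a single joint driven by an explicit PD torque with semi-implicit Euler, and the inertia and gains are made-up assumptions, not values from this robot. For this toy system the damping term alone goes unstable once dt * kd / inertia exceeds 2, so a very light link needs a smaller dt than a heavy one at the same gains.

```python
def simulate_pd_joint(inertia, kp, kd, dt, steps, q0=0.1):
    """Integrate inertia * q'' = -kp * q - kd * q' with semi-implicit Euler.

    Returns the joint position after `steps` steps, starting from q0 at rest.
    """
    q, v = q0, 0.0
    for _ in range(steps):
        torque = -kp * q - kd * v
        v += dt * torque / inertia
        q += dt * v
    return q


# Hypothetical light link: all three values are assumptions for illustration.
INERTIA, KP, KD = 1e-4, 5.0, 0.05

for dt in (0.005, 0.002):
    ratio = dt * KD / INERTIA
    final_q = simulate_pd_joint(INERTIA, KP, KD, dt, steps=200)
    print(f"dt={dt}: dt*kd/inertia={ratio:.2f}, |q| after 200 steps={abs(final_q):.3e}")
```

With these assumed values, the dt = 0.005 run diverges while the dt = 0.002 run settles to zero, which is consistent with the behavior described in the comment, though a real articulated robot in contact with the ground is of course far more complex.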
Great to hear that! |
Hi @WoohyunCha ! This is the best I got as of today :
bdx_amp_for_hardware-2024-08-21_17.06.01.mp4
That's a big step forward, but still not perfect: too much shaking, the walk does not look very robust, etc. So I'm still playing with parameters :) I have been using dt = 0.002 and decimation 6. I'll try increasing it to 8 to see if it helps things. |
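For reference, the effect of changing the decimation can be worked out directly, assuming the policy acts once every `decimation` simulation steps so that the control period is the simulation dt times the decimation. The helper names below are hypothetical.

```python
def control_period(sim_dt, decimation):
    """Seconds between two policy actions."""
    return sim_dt * decimation


def control_frequency(sim_dt, decimation):
    """Policy actions per second."""
    return 1.0 / control_period(sim_dt, decimation)


for decimation in (6, 8):
    period = control_period(0.002, decimation)
    freq = control_frequency(0.002, decimation)
    print(f"decimation={decimation}: period={period:.3f} s, frequency={freq:.1f} Hz")
```

Under that assumption, dt = 0.002 with decimation 6 gives a 12 ms control period (about 83 Hz), and decimation 8 gives 16 ms (62.5 Hz), which is close to the 60 fps of the recorded reference motion.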
Hi, I've been working on the parameters a lot and found out that setting disc_coef = 1 and disc_grad_penalty as 5, which is closer to the parameters in the original framework, works best. I guess trying to match the parameters with IsaacgymEnvs was a bad idea.
|
Thank you so much @WoohyunCha, I'll try with these parameters and see if it improves things ! Were you able to introduce velocity following for turning ? I tried adding it along with the relevant reference motion, but it did not really work. I only tried once though. |
Haven't tried it yet, but my colleague did it in IsaacgymEnvs so I guess it should work in leggedgym as well |
Hello !
I'm trying to use AMP_for_hardware to learn a nice walk on a small bipedal robot. I have produced motion example data in the same format as you used here.
I tried running trainings with pretty much the same hyperparameters as the ones in your example a1_amp_config.py. I had to tweak some things to match the size and weight of my own robot, but not a lot.
To validate that there is no bug in my adaptations, I first train with only a walking forward example motion, only a positive x linear velocity command, and no noise/randomization. The robot progressively learns to move forward a little bit, but by shaking its feet and making small jumps, not by imitating the walking example motion. Here is an example (after 1000 steps of 8000 envs)
amp_for_hardware-2024-08-06_15.53.49.mp4
And here is what the reference motion looks like (using your replay_amp_data.py script)
amp_for_hardware-2024-08-06_15.55.33.mp4
The training curves look like this :
I am able to run your a1_amp environment with the provided checkpoint and it runs great, it's very fun to control it with a joystick :)
My questions are :
a1_amp_config.py to train the provided policy ? Meaning only the velocity tracking rewards and amp_reward_coef = 2.0 ?
Thank you very much !