update tdmpc into kscale sim module. #72

Open. chamorajg wants to merge 4 commits into master.

Conversation

chamorajg
Collaborator

No description provided.

@budzianowski
Collaborator

Do you have some results of standing?

Member

@WT-MM WT-MM left a comment

Looks good. Could you run `make format` and `make static-checks` to make sure that this PR passes the status checks?

@chamorajg
Collaborator Author

chamorajg commented Sep 13, 2024

> Do you have some results of standing?

I only started training stompypro yesterday, and it is probably doing about 100 iterations per day (it takes a very long time to train). I have standing results for dora taken at 200 iterations (25M timesteps).
[Image: output]

I also have results of stompypro learning to stand a bit, but it is at a very early stage of training.
[Image: stompypro]
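
For scale, the figures quoted in this comment can be turned into a rough throughput estimate. This is only back-of-the-envelope arithmetic: it assumes stompypro uses the same number of timesteps per iteration as the dora run, which the comment does not state.

```python
# Figures quoted in the comment above.
DORA_ITERATIONS = 200
DORA_TIMESTEPS = 25_000_000
ITERATIONS_PER_DAY = 100  # rough estimate given for stompypro

# Timesteps collected per training iteration in the dora run.
timesteps_per_iteration = DORA_TIMESTEPS // DORA_ITERATIONS

# Assumption: stompypro uses the same timesteps per iteration as dora.
timesteps_per_day = timesteps_per_iteration * ITERATIONS_PER_DAY

# Days needed for stompypro to reach the iteration count of the dora result.
days_to_match_dora = DORA_ITERATIONS / ITERATIONS_PER_DAY

print(timesteps_per_iteration)  # 125000
print(timesteps_per_day)        # 12500000
print(days_to_match_dora)       # 2.0
```

Under that assumption, reaching the same 200 iterations as the dora result would take roughly two days of training.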

@budzianowski
Collaborator

I see! What is the most time-consuming step in the pipeline?

@chamorajg
Collaborator Author

The horizon update. This line makes the training process very slow.

@budzianowski
Collaborator

The horizon is only 5 and the MLP is tiny. I don't understand why it would be that slow.

@chamorajg
Collaborator Author

The planning step (MPC), which collects samples from interactions with the environment, is pretty slow. The number of planning iterations we set for humanoid-type tasks is around 10-12.
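
To illustrate why planning can dominate even with a short horizon, here is a minimal sketch of a generic sampling-based planner in the CEM style. It is not the code from this PR: `plan`, `toy_dynamics`, `toy_reward`, and the values `num_samples=512` and `num_elites=64` are hypothetical, and only `horizon=5` and `iterations=12` come from the discussion above. The point is that every environment step pays for `iterations * horizon` sequential model evaluations, each batched over all sampled action sequences.

```python
import numpy as np


def plan(state, dynamics, reward, horizon=5, iterations=12,
         num_samples=512, num_elites=64, action_dim=2, seed=0):
    """CEM-style planner: returns the first planned action and the model-call count."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    model_calls = 0
    for _ in range(iterations):
        # Sample candidate action sequences: (num_samples, horizon, action_dim).
        noise = rng.standard_normal((num_samples, horizon, action_dim))
        actions = np.clip(mean + std * noise, -1.0, 1.0)
        states = np.repeat(state[None, :], num_samples, axis=0)
        returns = np.zeros(num_samples)
        # Roll every candidate forward through the model, one step at a time.
        for t in range(horizon):
            returns += reward(states, actions[:, t])
            states = dynamics(states, actions[:, t])
            model_calls += 1
        # Refit the sampling distribution to the best-scoring candidates.
        elites = actions[np.argsort(returns)[-num_elites:]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
    return mean[0], model_calls


def toy_dynamics(states, actions):
    # Hypothetical stand-in for a learned model: actions nudge the state.
    return states + 0.1 * actions


def toy_reward(states, actions):
    # Hypothetical reward: stay close to the origin.
    return -np.sum(states ** 2, axis=1)


action, calls = plan(np.array([1.0, -1.0]), toy_dynamics, toy_reward)
print(calls)  # 60 sequential model evaluations for a single environment step
```

With these settings a single environment step triggers 60 sequential batched model evaluations, so data collection can be far slower than the small horizon and MLP size suggest. Whether this is the actual bottleneck in the PR would need to be confirmed by profiling.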

Collaborator

@budzianowski budzianowski left a comment

See the Makefile for linting and semi building, and see https://github.com/kscalelabs/sim/blob/master/CONTRIBUTING.md

if tdmpc_cfg.save_model and episode_idx % tdmpc_cfg.eval_freq_episode == 0:
    L.save(agent, f"tdmpc_policy_{int(step // tdmpc_cfg.episode_length) + 1}.pt")
    buffer.save(str(work_dir / "buffer.pt"))
    # # common_metrics['episode_reward'] = evaluate(env, agent, h1 if L.video is not None else None, tdmpc_cfg.eval_episodes, step, env_step, L.video, tdmpc_cfg.action_repeat)
Collaborator

Clean up the commented-out code.
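
As a reading aid for the snippet under review, the save condition and file naming can be sketched as two small helpers. `should_checkpoint` and `checkpoint_name` are hypothetical names, not functions from the PR, and the sketch assumes `step` and `episode_length` are integers.

```python
def should_checkpoint(save_model: bool, episode_idx: int, eval_freq_episode: int) -> bool:
    """Mirror the condition in the snippet: save only on evaluation episodes."""
    return save_model and episode_idx % eval_freq_episode == 0


def checkpoint_name(step: int, episode_length: int) -> str:
    """Build the file name used in the snippet; the episode index is 1-based."""
    return f"tdmpc_policy_{step // episode_length + 1}.pt"


print(should_checkpoint(True, 20, 10))  # True
print(should_checkpoint(True, 21, 10))  # False
print(should_checkpoint(True, 0, 10))   # True
print(checkpoint_name(2000, 1000))      # tdmpc_policy_3.pt
```

Note that the condition is also true for episode 0, so a checkpoint is written before any meaningful training has happened.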


3 participants