Gradient on trajectory vs. sum of gradients on 1-step transitions #454

yecohn · 2024-01-18T19:11:33Z

Hi, I have been experimenting with Brax and am currently interested in taking the derivative of a reward function with respect to a friction parameter. The reward function is given by:

(state_final - state_initial) / dt * <some_constant>

Interestingly, I found that computing the gradient on each 1-step transition and summing those gradients over the full trajectory gives a different result from computing the full trajectory and then taking the gradient.

Does anybody know how JAX computes gradients in this case?

Thanks!

btaba (Maintainer) · 2024-02-11T21:10:30Z

Hi @yecohn, see the JAX autodiff documentation for more on how gradients are computed. Why do you expect the gradient over the full trajectory to equal the sum of the gradients of each step transition? The step function is applied recursively, so each state depends on the friction parameter both directly and through all earlier states. You need to apply the chain rule through the whole rollout to get the gradient over the full trajectory.
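To make the difference concrete, here is a minimal sketch on a toy system, not Brax itself. It assumes a 1D block with viscous friction `mu`, and it assumes that "gradient on a 1-step transition" means differentiating a single step with the incoming state treated as a constant. The names `step`, `step_reward`, `rollout_reward`, and `summed_one_step_grads`, and the constants `DT`, `C`, `N_STEPS`, are all hypothetical and chosen only for illustration.

```python
import jax
import jax.numpy as jnp

DT = 0.1      # time step (assumed value)
C = 1.0       # stands in for <some_constant>
N_STEPS = 10  # rollout length (assumed value)


def step(state, mu):
    """One step of a toy 1D block with viscous friction; state = [x, v]."""
    x, v = state[0], state[1]
    v_next = v - mu * v * DT
    x_next = x + v_next * DT
    return jnp.stack([x_next, v_next])


def step_reward(state, mu):
    """Reward for a single transition: (x_next - x) / dt * C."""
    return (step(state, mu)[0] - state[0]) / DT * C


def rollout_reward(mu, state0):
    """Reward over the whole trajectory: (x_final - x_initial) / dt * C."""
    def body(state, _):
        return step(state, mu), None
    final, _ = jax.lax.scan(body, state0, None, length=N_STEPS)
    return (final[0] - state0[0]) / DT * C


def summed_one_step_grads(mu, state0):
    """Sum of per-step gradients, each taken with its input state held fixed."""
    def body(state, _):
        # Differentiates only through this one step: the dependence of
        # `state` on mu via earlier steps is not seen by this grad call.
        g = jax.grad(step_reward, argnums=1)(state, mu)
        return step(state, mu), g
    _, grads = jax.lax.scan(body, state0, None, length=N_STEPS)
    return grads.sum()


mu = 0.5
state0 = jnp.array([0.0, 1.0], dtype=jnp.float32)

# Chain rule through every step of the rollout.
full_grad = jax.grad(rollout_reward)(mu, state0)
# Direct dependence on mu only, one step at a time.
summed = summed_one_step_grads(mu, state0)

print("full-trajectory gradient:", full_grad)
print("sum of one-step gradients:", summed)
```

In this toy setup the per-step rewards add up to the trajectory reward, yet the two gradients disagree. The one-step version drops the terms where `mu` changes an earlier state, which in turn changes every later transition.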
