Single-file SAC-N [1] implementation in JAX, with both Flax and Equinox versions. 10x faster than the PyTorch SAC-N implementation from CORL [2].
And still easy to use and understand! To run:
python sac_n_jax_flax.py --env_name="halfcheetah-medium-v2" --num_critics=10 --batch_size=256
python sac_n_jax_eqx.py --env_name="halfcheetah-medium-v2" --num_critics=10 --batch_size=256
Optionally, you can pass --config_path pointing to a YAML config file; see the pyrallis docs for more.
The main insight here is to jit the whole epoch loop with jax.lax.fori_loop or jax.lax.scan, not just a single update of the networks, as is usually done (jaxrl2, for instance). Jitting only the update gives a speedup of approximately 1.5x here.
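To illustrate the difference, here is a minimal sketch on a toy linear-regression problem, not the actual SAC-N code. The names loss_fn, update, epoch_python_loop, epoch_fori and epoch_scan are hypothetical and chosen only for this example. It assumes all batches for an epoch are stacked along a leading axis, so the loop over them can be compiled as one XLA program.

```python
import jax
import jax.numpy as jnp


def loss_fn(w, x, y):
    # Mean squared error of a linear model (toy stand-in for the SAC-N losses).
    return jnp.mean((x @ w - y) ** 2)


def update(w, x, y, lr=0.1):
    # One gradient step on a single batch.
    grads = jax.grad(loss_fn)(w, x, y)
    return w - lr * grads


# Common approach: jit only the update and drive the loop from Python.
update_jit = jax.jit(update)


def epoch_python_loop(w, xs, ys):
    for i in range(xs.shape[0]):
        w = update_jit(w, xs[i], ys[i])
    return w


# Approach described above: jit the whole epoch with jax.lax.fori_loop.
@jax.jit
def epoch_fori(w, xs, ys):
    def body(i, w):
        return update(w, xs[i], ys[i])

    return jax.lax.fori_loop(0, xs.shape[0], body, w)


# Same idea with jax.lax.scan, which can also collect per-step metrics.
@jax.jit
def epoch_scan(w, xs, ys):
    def step(w, batch):
        x, y = batch
        w = update(w, x, y)
        return w, loss_fn(w, x, y)

    return jax.lax.scan(step, w, (xs, ys))


key = jax.random.PRNGKey(0)
true_w = jnp.array([1.0, -2.0, 0.5])
xs = jax.random.normal(key, (100, 32, 3))  # 100 batches of 32 samples
ys = xs @ true_w
w0 = jnp.zeros(3)

w_py = epoch_python_loop(w0, xs, ys)
w_fori = epoch_fori(w0, xs, ys)
w_scan, losses = epoch_scan(w0, xs, ys)
```

All three variants compute the same parameters. The jitted-epoch versions avoid returning to Python between updates, which is where the extra speedup is expected to come from; how large it is depends on the model and hardware.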
Both runs were trained on the same V100 GPU.