nslyubaykin/relax_sac_example

Example SAC implementation with ReLAx

This repository contains an implementation of Soft Actor-Critic (SAC) with ReLAx.

The SAC actor was trained on the Hopper-v2 MuJoCo Gym environment for 1M environment steps.

The graph of average return vs. environment step is shown below (logged every 10k steps):

sac_training

The distribution of estimated Q-values vs data Q-values is shown below:

sac_q_func
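
For readers who want to reproduce a comparison like this, here is a minimal sketch of one common way to obtain "data Q-values": the discounted reward-to-go computed from a logged episode. This is an assumption about what the plot shows, not a description of this repository's code. The function name `discounted_returns` and the default `gamma=0.99` are hypothetical, and nothing here uses the ReLAx API. SAC's critic also accounts for an entropy bonus, which this sketch ignores.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go for one episode: G_t = r_t + gamma * G_{t+1}.

    Hypothetical helper for illustration; gamma=0.99 is an assumed default.
    The return after the final step is taken to be zero.
    """
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk the episode backwards so each step reuses the next step's return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Toy episode: three unit rewards with gamma = 0.5
# gives returns of 1.75, 1.5 and 1.0 for steps 0, 1 and 2.
toy = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
```

The resulting array can then be compared against the critic's predictions for the same state-action pairs, for example with two overlaid histograms. Episodes cut off by a time limit understate the true return near the end of the episode, so those steps are best treated with caution.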

Resulting Policy:

sac_run.mp4