[BugFix] adapt log-prob TD batch-size to advantage shape in PPO #1857
| Job | Run time |
|---|---|
| | 3m 14s |
| | 3m 14s |
| | 3m 24s |
| | 3m 24s |
| | 2m 31s |
| | 2m 31s |
| | 3m 20s |
| | 2m 45s |
| | 0s |
| | 0s |
| Total | 24m 23s |