SIL #158
Comments
Hi @qgallouedec - I haven't done much testing yet, but if there's no rush I'd love to work on this in my spare time. The official implementation appears to be here: https://github.com/google-research/google-research/tree/master/sail_rl
There is no rush at all :)
Hey everyone, I have tried the code that @emrul pasted in the IQN PR comments, and it works. The one thing I haven't got to work is the SubprocVecEnv wrapping. Just wanted to let you know. :)
Thanks @richardjozsa - that's interesting, because I exclusively use SubprocVecEnv for training and DummyVecEnv for evaluation. What happens when you use SubprocVecEnv?
This is the error I got, but if it works for you then I'll recheck. I use a custom env, so maybe that caused it. Traceback (most recent call last):
... looks like an error trying to load your env from pickle, but my modifications don't make any changes to envs (the replay buffer holds the SAIL returns internally), so I don't think this is caused by my amendments.
My bad, sorry. The problem was in my environment, and it works fine. One comment: you have set the replay buffer to device="cpu". I guess that can be "auto". :)
Great, and yes - good catch on the device, I will correct that!
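For readers wondering what "the replay buffer holds the returns internally" could look like, here is a minimal sketch in plain Python. It is not @emrul's code: the function name `discounted_returns`, its arguments, and the choice of plain discounted Monte Carlo returns are all assumptions made for illustration. The idea is that the buffer can derive per-step returns from the rewards and episode-end flags it already stores, so the environments themselves stay untouched.

```python
from typing import List


def discounted_returns(
    rewards: List[float], dones: List[bool], gamma: float = 0.99
) -> List[float]:
    """Hypothetical helper: discounted return for every stored transition.

    `dones[t]` is True when step t is the last step of an episode, so no
    reward from later steps is allowed to leak into its return.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # episode boundary: nothing follows this step
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Two episodes of two steps each, stored back to back in the buffer.
example = discounted_returns(
    [1.0, 2.0, 3.0, 4.0], [False, True, False, True], gamma=0.5
)
```

If the last stored episode has not finished yet, its transitions get a truncated return under this sketch; a real buffer would need to decide how to handle that case.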
Self Imitation Learning
@emrul has implemented SAIL, see #139 (comment)
@emrul, is there an official implementation for those two? Do you match the results from the paper with your implementation?