[RLlib] JAXPolicy (working discrete-actions PPO prototype). #13014
rllib/agents/a3c/a3c_torch_policy.py
Outdated
@@ -84,8 +84,9 @@ def _value(self, obs):
         return self.model.value_function()[0]


-A3CTorchPolicy = build_torch_policy(
+A3CTorchPolicy = build_policy_class(
Can we do this kind of unrelated change in a separate PR?
done
Also, review this one here first before proceeding: #13091
…policy # Conflicts: # rllib/BUILD # rllib/agents/ddpg/ddpg_torch_model.py # rllib/agents/maml/maml_tf_policy.py # rllib/policy/policy_template.py
Would be cool to also add a JAX icon to the documentation (rllib.rst, rllib-algorithms.rst, rllib-toc.rst).
…policy # Conflicts: # rllib/utils/framework.py
A follow-up PR will have documentation updates, including the algo table with JAX sigils.
…policy # Conflicts: # rllib/BUILD # rllib/policy/torch_policy.py
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message. Please feel free to reopen or open a new issue if you'd still like it to be addressed. Again, you can always ask for help on our discussion forum or Ray's public Slack channel. Thanks again for opening the issue!
NOTE: Merge this PR here first for separation of concerns: #13091
JAXPolicy (working discrete-actions PPO prototype).
This PR adds:
Why are these changes needed?
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.