A collection of RL agents implemented in TensorFlow 2.0
A good explanation of what this algorithm does is given in OpenAI's Spinning Up docs: "whose updates indirectly maximize performance, by instead maximizing a surrogate objective function which gives a conservative estimate for how much \(J(\pi_\theta)\) will change as a result of the update"
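As a rough illustration of what such a surrogate objective can look like (a minimal sketch of PPO's clipped surrogate, not necessarily the exact loss used in this repository; `new_log_probs`, `old_log_probs`, `advantages`, and `clip_ratio` are hypothetical names):

```python
import tensorflow as tf

def ppo_surrogate_loss(new_log_probs, old_log_probs, advantages, clip_ratio=0.2):
    """Clipped surrogate objective, returned as a loss to minimize.

    new_log_probs / old_log_probs: log pi_theta(a|s) under the current policy
    and the policy that collected the data; advantages: estimated advantages.
    """
    # Probability ratio r(theta) = pi_theta(a|s) / pi_theta_old(a|s)
    ratio = tf.exp(new_log_probs - old_log_probs)
    # Clipping keeps the update conservative: the objective stops improving
    # once the ratio leaves [1 - clip_ratio, 1 + clip_ratio].
    clipped = tf.clip_by_value(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio)
    surrogate = tf.minimum(ratio * advantages, clipped * advantages)
    # Negate because optimizers minimize, while the surrogate is maximized.
    return -tf.reduce_mean(surrogate)
```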
Each update only uses data collected while acting under the most recent version of the policy.
Each update can use data recorded at any point during training, regardless of how the agent was exploring the environment at that time.
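These two statements describe the on-policy and off-policy data regimes respectively. A minimal sketch of the difference in how data is handled (hypothetical buffer classes, not the ones in this repository): an on-policy agent discards its rollout buffer after every update, while an off-policy agent keeps sampling from a replay buffer that accumulates data across the whole of training.

```python
import random

class OnPolicyBuffer:
    """Holds only trajectories collected by the current policy."""
    def __init__(self):
        self.transitions = []

    def store(self, transition):
        self.transitions.append(transition)

    def get_and_clear(self):
        # The whole batch is consumed by one update and then thrown away,
        # so the next update again sees only fresh, on-policy data.
        data, self.transitions = self.transitions, []
        return data


class ReplayBuffer:
    """Keeps transitions from the entire history of training (off-policy)."""
    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.transitions = []

    def store(self, transition):
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)  # drop the oldest transition
        self.transitions.append(transition)

    def sample(self, batch_size):
        # Updates may reuse data gathered under old versions of the policy.
        return random.sample(self.transitions, batch_size)
```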