JoyRL

JoyRL is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, JoyRL aims to free users from the burden of implementing algorithms with tricky details and unfriendly APIs. JoyRL is designed so that users can train and test RL algorithms with only a hyperparameter configuration file, which makes it much easier for beginners to learn and use. JoyRL also supports plenty of state-of-the-art RL algorithms, including RLHF (the core of ChatGPT); see the algorithm list below. In addition, JoyRL provides a modularized framework for users to customize their own algorithms and environments.
```bash
# you need to install Anaconda first
conda create -n joyrl python=3.10
conda activate joyrl
pip install -U joyrl
```
Install PyTorch:

```bash
# CPU
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1
# CUDA 11.8
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
The following presents a demo of JoyRL. First create a YAML file to configure the hyperparameters, then run the command below in your terminal. That is all you need to do to train a DQN agent on the CartPole-v1 environment.

```bash
joyrl --yaml ./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml
```
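Such a preset file groups hyperparameters into sections. The sketch below illustrates the general shape of a config; the exact keys and section names are assumptions for illustration, so consult the shipped files under `./presets/` for the authoritative schema.

```yaml
# Illustrative JoyRL preset sketch -- key names are assumptions,
# see the bundled presets for the real schema.
general_cfg:
  algo_name: DQN          # which algorithm to run
  env_name: CartPole-v1   # Gymnasium environment id
  mode: train             # train or test
  max_episode: 200        # training episode budget
algo_cfg:
  lr: 0.0001              # learning rate
  gamma: 0.95             # discount factor
  batch_size: 64          # replay-buffer sample size
```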
Or you can run the following code in a Python file:

```python
import joyrl

if __name__ == "__main__":
    print(joyrl.__version__)
    yaml_path = "./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml"
    joyrl.run(yaml_path=yaml_path)
```
More tutorials and API documentation are hosted on JoyRL docs or JoyRL 中文文档.
Name | Reference | Author | Notes |
---|---|---|---|
Q-learning | RL introduction | johnjim0816 | |
Sarsa | RL introduction | johnjim0816 | |
DQN | DQN Paper | johnjim0816 | |
Double DQN | DoubleDQN Paper | johnjim0816 | |
Dueling DQN | DuelingDQN Paper | johnjim0816 | |
NoisyDQN | NoisyDQN Paper | johnjim0816 | |
CategoricalDQN | CategoricalDQN Paper | johnjim0816 | |
DDPG | DDPG Paper | johnjim0816 | |
TD3 | TD3 Paper | johnjim0816 | |
A2C/A3C | A3C Paper | johnjim0816 | |
PPO | PPO Paper | johnjim0816 | |
SoftQ | SoftQ Paper | johnjim0816 | |
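Q-learning heads the table above; the tabular update it implements is Q(s,a) ← Q(s,a) + α[r + γ·max_a′ Q(s′,a′) − Q(s,a)]. Below is a minimal self-contained sketch of that update rule in plain Python; the function and variable names are invented for illustration and are not JoyRL's internal API.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    # Bootstrap from the greedy action in the next state (0 if terminal).
    bootstrap = 0.0 if done else gamma * max(Q[s_next].values(), default=0.0)
    target = r + bootstrap
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Tiny usage example on a two-state chain.
Q = defaultdict(lambda: defaultdict(float))
q_update(Q, "s0", "right", 0.0, "s1", False)  # no reward seen yet -> stays 0
q_update(Q, "s1", "right", 1.0, "end", True)  # terminal reward updates Q
print(round(Q["s1"]["right"], 3))  # 0.1 after one step with alpha=0.1
```

The off-policy max over next-state actions is exactly what distinguishes Q-learning from Sarsa (which bootstraps from the action actually taken).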
RL Platform | # of Alg. | Custom Env | Async Training | RNN Support | Multi-Head Observation | Backend |
---|---|---|---|---|---|---|
Baselines | 9 | ✔️ (gym) | ❌ | ✔️ | ❌ | TF1 |
Stable-Baselines | 11 | ✔️ (gym) | ❌ | ✔️ | ❌ | TF1 |
Stable-Baselines3 | 7 | ✔️ (gym) | ❌ | ❌ | ✔️ | PyTorch |
Ray/RLlib | 16 | ✔️ | ✔️ | ✔️ | ✔️ | TF/PyTorch |
SpinningUp | 6 | ✔️ (gym) | ❌ | ❌ | ❌ | PyTorch |
Dopamine | 7 | ❌ | ❌ | ❌ | ❌ | TF/JAX |
ACME | 14 | ✔️ (dm_env) | ❌ | ✔️ | ✔️ | TF/JAX |
keras-rl | 7 | ✔️ (gym) | ❌ | ❌ | ❌ | Keras |
cleanrl | 9 | ✔️ (gym) | ❌ | ❌ | ❌ | poetry |
rlpyt | 11 | ❌ | ❌ | ✔️ | ✔️ | PyTorch |
ChainerRL | 18 | ✔️ (gym) | ❌ | ✔️ | ❌ | Chainer |
Tianshou | 20 | ✔️ (Gymnasium) | ❌ | ✔️ | ✔️ | PyTorch |
JoyRL | 12 | ✔️ (Gymnasium) | ✔️ | ✔️ | ✔️ | PyTorch |
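The "Async Training" column refers to the pattern where several interactors collect experience concurrently while a learner consumes it; JoyRL realizes this with Ray actors. The toy sketch below illustrates the pattern with only the standard library; all names here are invented for this illustration and are not JoyRL's API.

```python
import queue
import threading

def interactor(worker_id, experience_q, n_steps):
    """Collect stub transitions and push them onto the shared queue."""
    for step in range(n_steps):
        experience_q.put((worker_id, step, 1.0))  # (id, step, reward) stub

def learner(experience_q, n_expected, results):
    """Consume transitions asynchronously and accumulate total reward."""
    total = 0.0
    for _ in range(n_expected):
        _, _, reward = experience_q.get()
        total += reward
    results["total_reward"] = total

experience_q = queue.Queue()
results = {}
workers = [threading.Thread(target=interactor, args=(i, experience_q, 5))
           for i in range(4)]
learn = threading.Thread(target=learner, args=(experience_q, 20, results))
for t in workers + [learn]:
    t.start()
for t in workers + [learn]:
    t.join()
print(results["total_reward"])  # 20.0: 4 interactors x 5 transitions each
```

With Ray, each interactor would instead be a remote actor, so experience collection can scale across processes and machines rather than threads.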
Here are some other highlights of JoyRL:
- Provides a series of Chinese courses, JoyRL Book (with an English version in progress), suitable for beginners who want to start with a combination of theory and practice
- John Jim, Peking University
- Qi Wang, Shanghai Jiao Tong University
- Yiyuan Yang, University of Oxford