The code for the Centralized Reward Agent (CenRA) framework for multi-task reinforcement learning (MTRL).
CenRA consists of two components: one centralized reward agent (CRA) and multiple distributed policy agents, one per task. The CRA learns a reward model that shares and transfers task-relevant knowledge to the policy agents.
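The two-component layout above can be sketched as follows. This is a minimal illustration of the structure only; all class and method names here are assumptions, not the repository's actual API:

```python
class CentralizedRewardAgent:
    """Learns a shared reward model used to transfer task-relevant
    knowledge to the policy agents (illustrative sketch)."""

    def __init__(self):
        self.reward_model = {}  # stand-in for a learned reward model

    def suggest_reward(self, obs, action) -> float:
        # A real CRA would query its learned reward model here.
        return self.reward_model.get((obs, action), 0.0)


class PolicyAgent:
    """Learns the policy for a single task, guided by the shared CRA."""

    def __init__(self, task_name: str, cra: CentralizedRewardAgent):
        self.task_name = task_name
        self.cra = cra  # every policy agent shares the same CRA instance


# One CRA shared by multiple task-specific policy agents.
cra = CentralizedRewardAgent()
agents = [PolicyAgent(task, cra) for task in ("task_1", "task_2", "task_3")]
```

The key design point is that the CRA is a single shared object: each distributed policy agent holds a reference to it rather than maintaining its own reward model.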
- The code supports Python 3.6 to 3.10 only. (Due to the PyBullet rendering package, Python 3.11 and higher are not supported.)
- This code has been tested with `pytorch==2.0.1+cu117`.
- Install all dependent packages:

  ```shell
  pip3 install -r requirements.txt
  ```
- For the MujocoCar environment, refer to this instruction for detailed installation.
Run the following command to train CenRA on the environment specified by `<Environment>`:

```shell
python run-<Environment>.py
```

All available environments and their corresponding `<Environment>` values are listed below:
- 2DMaze environment: `2dmaze`, running script `run-2dmaze.py`.
- 3DPickup environment: `3dpickup`, running script `run-3dpickup.py`.
- MujocoCar environment: `mujococar`, running script `run-mujococar.py`.
All hyper-parameters are set to default values in the code. You can change them by adding arguments to the command line. Selected arguments are listed below; for the full list, please refer to the running scripts `run-<Environment>.py`.
- `--exp-name`: the name of the experiment, used for TensorBoard logging and model saving.
- `--suggested-reward-scale`: the scale of the knowledge reward, default is 1.
- `--lamb`: the weight of the knowledge reward, default is 0.5.
- `--total-timesteps`: the total timesteps to train the agent.
- `--pa-learning-starts`: the burn-in steps of the distributed policy agents.
- `--ra-learning-starts`: the burn-in steps of the centralized reward agent.
- `--pa-buffer-size`: the buffer size of the policy agent.
- `--pa-batch-size`: the batch size of the policy agent.
- `--ra-batch-size`: the batch size of the reward agent.
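To make the roles of `--lamb` and `--suggested-reward-scale` concrete, here is a hedged sketch of how a policy agent might combine the environment reward with the CRA's knowledge reward. The function name and the exact combination rule are assumptions for illustration, not the repository's actual code:

```python
def shaped_reward(env_reward: float,
                  knowledge_reward: float,
                  lamb: float = 0.5,
                  suggested_reward_scale: float = 1.0) -> float:
    """Combine a task's environment reward with the CRA's knowledge reward.

    `lamb` weights the knowledge term (--lamb, default 0.5), and
    `suggested_reward_scale` rescales the CRA's suggestion
    (--suggested-reward-scale, default 1). Illustrative only.
    """
    return env_reward + lamb * (suggested_reward_scale * knowledge_reward)


# Environment reward 1.0, CRA suggests 0.4: 1.0 + 0.5 * (1.0 * 0.4) = 1.2
print(shaped_reward(1.0, 0.4))
```

Setting `--lamb 0` would recover plain single-task training under this sketch, since the knowledge term vanishes.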
We compare CenRA with several baselines, including the backbone algorithms DQN (Mnih et al. 2015) for discrete control and SAC (Haarnoja et al. 2018) for continuous control, as well as ReLara (Ma et al. 2024), PiCor (Bai et al. 2023), and MCAL (Mysore et al. 2022).