Counterfactual experience augmented off-policy reinforcement learning

Code of Counterfactual Experience Augmented Off-policy Reinforcement Learning. The paper has been accepted by Neurocomputing.

Counterfactual experience augmentation method refers to utils/CEA.py.

The maximum entropy sampling method can be referenced in a separate repository: https://github.com/Aegis1863/HdGkde

Requirements

python 3.8, torch, numpy, pandas, seaborn, tqdm, gymnasium, scikit-learn

To run our method

Continuous control:

python .\DDPG.py -w 1 --sta --per -t pendulum

python .\DDPG.py -w 1 --sta --per -t lunar

Discrete Control:

python .\RDQN.py -w 1 --sta --sta_kind regular -t sumo

python .\RDQN.py -w 1 --sta --sta_kind regular -t highway

terminal parameters:
- -w: 1 for save data, 0 for test and do not save data;
- -t task: pendulum, lunar; sumo highway;

Then data will be in data\plot_data\{task}\{model_name}\{...}.csv.

Cite

@article{LEE2025130017,
    title = {Counterfactual experience augmented off-policy reinforcement learning},
    journal = {Neurocomputing},
    pages = {130017},
    year = {2025},
    issn = {0925-2312},
    doi = {https://doi.org/10.1016/j.neucom.2025.130017},
    url = {https://www.sciencedirect.com/science/article/pii/S0925231225006897},
    author = {Sunbowen Lee and Yicheng Gong and Chao Deng},
    keywords = {Reinforcement learning, Variational autoencoder, Counterfactual inference, Bisimulation},
}

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
env		env
model/sta		model/sta
sumo_rl		sumo_rl
utils		utils
.gitignore		.gitignore
DDPG.py		DDPG.py
MBPO_cons.py		MBPO_cons.py
MBPO_disc.py		MBPO_disc.py
PPO_cons.py		PPO_cons.py
PPO_disc.py		PPO_disc.py
RDQN-no_per.py		RDQN-no_per.py
RDQN.py		RDQN.py
README.md		README.md
SAC_cons.py		SAC_cons.py
SAC_disc.py		SAC_disc.py
highway_RDQN.py		highway_RDQN.py
sumo_PPO.py		sumo_PPO.py
train_sta.py		train_sta.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Counterfactual experience augmented off-policy reinforcement learning

Requirements

To run our method

Cite

About

Releases

Packages

Languages

Aegis1863/CEA

Folders and files

Latest commit

History

Repository files navigation

Counterfactual experience augmented off-policy reinforcement learning

Requirements

To run our method

Cite

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages