Implementation of the paper Unraveling the Rainbow: can value-based methods schedule?
Requirements:

- python ≥ 3.11.11
- pytorch ≥ 2.4.0
- gym ≥ 0.22.0
- numpy ≥ 2.0.2
- pandas ≥ 2.2.3
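Assuming a standard pip-based setup (the repository's own install instructions are not shown here), the dependencies could be installed with something like:

```bash
# Note: PyTorch is published on PyPI under the name "torch"
pip install "torch>=2.4.0" "gym>=0.22.0" "numpy>=2.0.2" "pandas>=2.2.3"
```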
The repository is organized as follows:

- `/data_dev` and `/data_test` store the validation and testing instances for both the job-shop (JSSP) and flexible job-shop (FJSP) scheduling problems, for all instance sizes considered in our work.
- `/env` contains the implementation of the problem environment.
- `/models` contains the implementation of the models for the value-based algorithms and PPO.
- `/network` contains utility code for the heterogeneous GNN model, the multi-layer perceptron, and the noisy linear layers.
- `/results` contains the results on each individual benchmark instance used in our paper.
- `/save` is the directory used to store all trained models.
- `/utils` contains different utilities used during training and testing.
- `config.json` is the configuration file, holding all parameters that must be adjusted for the different scenarios.
- `PPO_model.py` contains the implementation of the HGNN and PPO algorithms used in this work.
- `train_dqn.py` is the training file for the value-based algorithms.
- `train_ppo.py` is the training file for the PPO algorithm.
- `validate.py` is the file used for performing the validation steps.
- `test.py` is the testing file for randomly generated instances.
- `test_benchmark.py` is the testing file for benchmark instances.
In our paper, we perform several experiments on both the JSSP and FJSP, covering multiple instance sizes and algorithms. To run each one, change the `config.json` file accordingly, specifically the following parameters (an illustrative sketch of the file follows this list):

- Problem type: set `is_fjsp` in `env_paras` to `true` for FJSP experiments, or `false` for JSSP experiments.
- Instance size: adjust `num_jobs` and `num_mas` in `env_paras` to match the problem instance (e.g., 20 jobs and 10 machines).
- Algorithm selection: use the `config_name` field in either `dqn_paras` or `ppo_paras` to select the desired algorithm (e.g., `"DQN"`, `"PPO"`). If using any Rainbow extensions, set the corresponding boolean parameters in `extensions_paras` (e.g., `"use_noisy": true`).
- Testing: for evaluation, configure `test_paras` appropriately (e.g., set `"is_ppo": true` if using PPO, and specify the instance size used during training by setting `saved_model_num_jobs` and `saved_model_num_mas` accordingly).
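As an illustration only, the fields above might be laid out as in the sketch below. The grouping of parameters follows the descriptions in this list, but the exact structure and the remaining fields of the repository's `config.json` may differ, so treat this as a guide rather than the authoritative layout:

```json
{
  "env_paras": {
    "is_fjsp": true,
    "num_jobs": 20,
    "num_mas": 10
  },
  "dqn_paras": {
    "config_name": "DQN"
  },
  "extensions_paras": {
    "use_noisy": true
  },
  "test_paras": {
    "is_ppo": false,
    "saved_model_num_jobs": 20,
    "saved_model_num_mas": 10
  }
}
```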
For training any value-based algorithm, after setting the correct parameters in `config.json`, run the following command:

```bash
python train_dqn.py
```

For the PPO algorithm, run:

```bash
python train_ppo.py
```
For testing, just set the boolean variable `is_fjsp` in `env_paras` accordingly.

For randomly generated problems, run:

```bash
python test.py
```
For benchmark instances, do not forget to set the proper path (`benchmark_path` in `test_paras`) for the benchmark dataset to be used for evaluation. To save the results on each individual instance to an Excel file, change the `output_path` variable as needed at the end of the `test_benchmark.py` file, as sketched below.
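The exact code at the end of `test_benchmark.py` is not shown here; as a hypothetical sketch of the idea (the `results_df` DataFrame name and the write call are assumptions, only `output_path` comes from the text above), it might look like:

```python
# Hypothetical sketch of the end of test_benchmark.py; the actual code may differ.
output_path = "./results/fjsp_20x10_dqn.xlsx"   # choose a distinct file per experiment
results_df.to_excel(output_path, index=False)   # write the per-instance results to Excel
```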
After, run:

```bash
python test_benchmark.py
```
We would like to thank the following repositories, which served as important foundations for our work:
- https://github.com/songwenas12/fjsp-drl/tree/main
- https://github.com/Curt-Park/rainbow-is-all-you-need
If you find our work valuable for your research, please cite us:
```bibtex
@misc{corrêa2025unravelingrainbowvaluebasedmethods,
  title={Unraveling the Rainbow: can value-based methods schedule?},
  author={Arthur Corrêa and Alexandre Jesus and Cristóvão Silva and Samuel Moniz},
  year={2025},
  eprint={2505.03323},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2505.03323}
}
```