A Job-Shop Scheduling Problem (JSSP) and Resource-Constrained Project Scheduling Problem (RCPSP) solver based on Reinforcement Learning, aimed at solving real-world industrial problems, and more.
This repo contains the official implementation of Learning to Solve Job Shop Scheduling under Uncertainty, published at CPAIOR 2024 (slides).
- Trains a scheduler for fixed problems or problems with uncertainty
- Support for training over random problems in order to generalize
- Support for training over problems with bounded but uncertain durations
- Reads JSSP in Taillard format, extended for uncertain durations
- Web live training metrics reported with Visdom
- Includes schedule visualization as Gantt charts
- Compares to OR-Tools
- Relies on state-of-the-art Deep Learning libraries: written with PyTorch, and DGL for graph neural networks
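As an illustration of the Taillard input mentioned in the list above, here is a minimal sketch of a reader for a deterministic instance. It assumes the commonly used layout (a header line with the number of jobs and machines, then one row of durations per job, then one row of 1-based machine indices per job). The function name `parse_taillard` is hypothetical and is not part of Wheatley, and the extension for uncertain durations is not handled here.

```python
from typing import List, Tuple


def parse_taillard(text: str) -> Tuple[List[List[int]], List[List[int]]]:
    """Parse a deterministic JSSP instance (assumed standard Taillard layout).

    Returns (durations, machines), both indexed as [job][position in job].
    Machine indices are converted from 1-based to 0-based.
    """
    # Keep only non-empty lines and split each one into integers.
    rows = [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    n_jobs, n_machines = rows[0]
    durations = rows[1:1 + n_jobs]
    machines = rows[1 + n_jobs:1 + 2 * n_jobs]
    # Basic shape checks so that truncated files fail loudly.
    if len(durations) != n_jobs or len(machines) != n_jobs:
        raise ValueError("instance has fewer rows than announced in the header")
    if any(len(row) != n_machines for row in durations + machines):
        raise ValueError("a row does not have one entry per machine")
    machines = [[m - 1 for m in row] for row in machines]
    return durations, machines


# Hypothetical 2-job, 2-machine instance used only for illustration.
EXAMPLE = """2 2
3 5
4 2
1 2
2 1
"""
durations, machines = parse_taillard(EXAMPLE)
```

With this toy instance, `durations[0][1]` is the duration of the second operation of the first job, and `machines[0][1]` is the (0-based) machine it runs on.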
- Install PyTorch for your hardware (instructions here)
- Install DGL for your hardware (instructions here)
- Install other dependencies:
pip install -r requirements.txt
Note: for Windows users, we strongly recommend using Anaconda.
See JSSP, PSP, ARGUMENTS and ADVICE for more information.
If you want to contribute to Wheatley, make sure to install the pre-commit hooks:
pre-commit install
- Wheatley learns how to schedule well and generalize over problems and/or uncertainty. It works from a representation of the schedule state-space directly, as opposed to the state-space of jobs and machines.
- Uses PPO as the main RL algorithm
- Captures schedules in the form of graphs and trains with an underlying Graph Neural Network
- Exposes a large number of hyper-parameters; default values are set to the best currently known values
- A small choice of different rewards is implemented.
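To make the graph view of a schedule more concrete, the following is a minimal sketch of the classical disjunctive-graph encoding of a JSSP: precedence edges between consecutive operations of a job, and conflict pairs between operations that share a machine. This is a textbook construction given as an assumption; Wheatley's actual graph encoding and node features may differ, and `build_disjunctive_graph` is a hypothetical name.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Op = Tuple[int, int]  # (job index, position of the operation within the job)


def build_disjunctive_graph(
    machines: List[List[int]],
) -> Tuple[List[Tuple[Op, Op]], List[Tuple[Op, Op]]]:
    """Return (precedence edges, conflict pairs) for a JSSP instance.

    machines[j][k] is the machine used by the k-th operation of job j.
    """
    # Precedence (conjunctive) edges: each operation precedes the next one
    # in the same job.
    precedence = [((j, k), (j, k + 1))
                  for j, row in enumerate(machines)
                  for k in range(len(row) - 1)]
    # Group operations by machine; any two operations on the same machine
    # cannot overlap, so the scheduler has to order them.
    by_machine: Dict[int, List[Op]] = {}
    for j, row in enumerate(machines):
        for k, m in enumerate(row):
            by_machine.setdefault(m, []).append((j, k))
    conflicts = [pair for ops in by_machine.values()
                 for pair in combinations(ops, 2)]
    return precedence, conflicts


# Hypothetical 2-job, 2-machine instance used only for illustration.
precedence, conflicts = build_disjunctive_graph([[0, 1], [1, 0]])
```

Building a schedule then amounts to choosing a direction for every conflict pair without creating a cycle.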
- Wheatley was first intended to replicate L2D, a model proposed in this paper: Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning
- Uses some intuitions and ideas from A Reinforcement Learning Environment For Job-Shop Scheduling
- Rewards are normalized
- Wheatley uses proper batching and parallel environments
- Wheatley uses advanced GNNs, such as GATv2 (with edge information), thanks to DGL.
- Wheatley embeds more information into every node of the schedule graph (like propagated time bounds), yielding more informed policies
- Wheatley has support for bounded uncertain durations, including at node and reward levels.
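The idea of propagated time bounds under bounded uncertain durations can be sketched as follows. Each operation carries a (min, max) duration, and completion-time bounds are pushed through the precedence graph in topological order. This is only an illustration under stated assumptions: the function name, the data layout, and the use of the standard-library `graphlib` module (Python 3.9+) are choices made for this example, not Wheatley's implementation.

```python
from graphlib import TopologicalSorter
from typing import Dict, Hashable, Iterable, Tuple

Bounds = Tuple[int, int]  # (lower bound, upper bound)


def propagate_completion_bounds(
    durations: Dict[Hashable, Bounds],
    predecessors: Dict[Hashable, Iterable[Hashable]],
) -> Dict[Hashable, Bounds]:
    """Compute (min, max) completion times for every operation.

    durations maps each operation to its (min, max) duration.
    predecessors maps an operation to the operations that must finish first;
    every predecessor is assumed to also appear in durations.
    """
    graph = {node: set(predecessors.get(node, ())) for node in durations}
    completion: Dict[Hashable, Bounds] = {}
    # static_order() yields each node after all of its predecessors.
    for node in TopologicalSorter(graph).static_order():
        preds = graph[node]
        # An operation starts once its latest predecessor has completed;
        # the bounds of a max are the max of the bounds.
        start_lo = max((completion[p][0] for p in preds), default=0)
        start_hi = max((completion[p][1] for p in preds), default=0)
        d_lo, d_hi = durations[node]
        completion[node] = (start_lo + d_lo, start_hi + d_hi)
    return completion


# Hypothetical three-operation example: "a" precedes "b", and both precede "c".
bounds = propagate_completion_bounds(
    {"a": (2, 3), "b": (1, 4), "c": (2, 2)},
    {"b": ["a"], "c": ["a", "b"]},
)
```

In this toy example, operation "c" completes no earlier than time 5 and no later than time 9. Bounds of this kind are one way a node could expose timing information to a policy.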
@inproceedings{wheatley-jssp,
title={Learning to Solve Job Shop Scheduling under Uncertainty},
author={Guillaume Infantes and St{\'e}phanie Roussel and Pierre Pereira and Antoine Jacquet and Emmanuel Benazera},
booktitle={21st International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR)},
year={2024},
}