A Python implementation of a reinforcement learning approach to food delivery logistics, addressing the Restaurant Meal Delivery Problem (RMDP) through an enhanced Anticipatory Customer Assignment (ACA) framework. The project introduces RL-ACA, a novel algorithm whose dynamic postponement strategy is learned through reinforcement learning.
This thesis project tackles the RMDP using real-world Meituan data (647,395 orders across 22 districts), developing an RL-enhanced algorithm that adapts its postponement decisions to optimize delivery operations. The implementation targets the $894 billion meal delivery industry's need for efficient solutions to dynamic challenges such as stochastic demand and time-sensitive deliveries.
Key Contributions:
- RL-ACA Algorithm: Novel reinforcement learning approach for dynamic postponement in delivery assignment
- Real-world Validation: Comprehensive benchmarking on the Meituan dataset across 176 scenarios
- Multi-stakeholder Optimization: Balances efficiency gains for drivers/platforms with service quality for customers/restaurants
- Adaptive Decision Making: Learns optimal assignment windows through feature engineering and temporal patterns
Key Features:
- RL-ACA Algorithm: Dynamic postponement using a Deep Q-Network with state features (time, congestion, bundling potential); see the sketch after this list
- Comprehensive Simulation: 12-hour operational periods with real Meituan order patterns and timing
- Multi-Method Comparison: Benchmarks against ACA-17 and Fastest ACA with statistical significance testing
- Real-world Integration: Uses actual restaurant locations, delivery deadlines, and preparation times
- Performance Analytics: Detailed KPI tracking across district sizes, temporal patterns, and stress levels
- Visualization Tools: Route optimization display and performance monitoring dashboards
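As a rough illustration of how a binary postponement decision plugs into order assignment, here is a minimal Python sketch. The feature names, the epsilon-greedy stub, and the should_postpone helper are assumptions for illustration, not the project's actual interface; a trained Q-network would replace the placeholder rule.

```python
# Illustrative only: feature names and this decision stub are assumptions,
# not the repository's actual code; a trained DQN would replace the rule below.
import random
from dataclasses import dataclass

@dataclass
class OrderContext:
    minutes_since_open: float        # position in the 10:00-22:00 window
    open_orders_per_courier: float   # crude congestion signal
    same_restaurant_pending: int     # crude proxy for bundling potential

def should_postpone(ctx: OrderContext, epsilon: float = 0.05) -> bool:
    """Binary postponement decision for one incoming order."""
    if random.random() < epsilon:                 # exploration during training
        return random.choice([True, False])
    # Placeholder greedy rule standing in for argmax_a Q(s, a)
    return ctx.same_restaurant_pending > 0 and ctx.open_orders_per_courier < 2.0

ctx = OrderContext(minutes_since_open=95, open_orders_per_courier=1.3,
                   same_restaurant_pending=2)
print("postpone" if should_postpone(ctx) else "assign now")
```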
thesis/
├── environment/ # Core simulation environment
│ ├── route_processing/ # Route calculation and optimization
│ ├── meituan_data/ # Real-world data integration utilities
│ ├── location_manager.py # Geographic and distance management
│ ├── order_manager.py # Order lifecycle and validation
│ ├── vehicle_manager.py # Fleet management and tracking
│ └── visualization.py # Real-time delivery visualization
├── models/ # Algorithm implementations
│ ├── aca_policy/ # Enhanced ACA with postponement logic
│ ├── fastest_bundling/ # Order bundling optimization
│ └── fastest_vehicle/ # Baseline nearest-vehicle assignment
├── training/ # RL training infrastructure
│ ├── config/ # Training configurations and hyperparameters
│ ├── core/ # Episode management and statistics
│ └── utils/ # Training utilities and metrics
├── benchmarking/ # Comprehensive performance analysis
│ ├── detailed_performance_analysis/ # District and demand analysis
│ ├── postponement_analysis/ # Postponement strategy evaluation
│ └── algorithm_benchmarking.py # Multi-method comparison
├── data/ # Datasets and results
│ ├── meituan_benchmark/ # Real-world Meituan data (647K orders)
│ ├── simulation_results/ # Algorithm performance outputs
│ └── processing_data_scripts/ # Data analysis and visualization
├── config.yaml # Main simulation configuration
├── train_rl.py # RL training entry point
└── datatypes.py # Core data structures and types
- Clone the repository:
  git clone https://github.com/TristanKruse/RMDP_Algorithm.git
  cd RMDP_Algorithm
- Set up a Python environment (Python 3.8+ recommended):
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Configure training parameters: edit config.yaml and the training configs in training/config/
- Train the RL model:
  python train_rl.py
- Monitor training progress: check for loss convergence and stabilization of the postponement rate
- Single algorithm evaluation:
  python benchmarking/algorithm_benchmarking.py
- Comprehensive analysis:
  python benchmarking/detailed_performance_analysis/run_all_analyses.py
- Postponement strategy analysis:
  python benchmarking/postponement_analysis/investigate_postponement_strategy.py
- Real-world validation: 176 Meituan scenarios (22 districts × 8 days)
- Filtered dataset: 120 validated scenarios after quality control
- Demand classification: Low/Medium/High based on total delay terciles (see the sketch after this list)
- Temporal analysis: Weekend vs weekday performance patterns
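The tercile-based demand classification above can be illustrated with pandas; the scenario table layout and the total_delay column name are assumptions, and the values are placeholders, not data from the benchmark.

```python
# Sketch of tercile-based demand classification; the "total_delay" column
# and the scenario table layout are assumptions, the values are placeholders.
import pandas as pd

scenarios = pd.DataFrame({
    "scenario_id": range(1, 10),
    "total_delay": [12.0, 48.5, 7.2, 95.1, 33.3, 61.8, 20.4, 80.0, 55.5],
})

# qcut splits the scenarios into three equally sized groups by total delay
scenarios["demand_class"] = pd.qcut(
    scenarios["total_delay"], q=3, labels=["Low", "Medium", "High"]
)
print(scenarios.sort_values("total_delay"))
```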
```yaml
# config.yaml
simulation:
  duration_hours: 12         # 10:00-22:00 operational window
  timestep_seconds: 30       # Simulation granularity
environment:
  vehicle_ratio: 0.54        # Couriers per restaurant
  travel_speed_kmh: 8        # Urban delivery speed
rl_training:
  learning_rate: 0.0005
  discount_factor: 0.95
  batch_size: 32
  target_update_frequency: 25
```
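A minimal sketch of reading these values with PyYAML, assuming only the top-level keys shown above; the project's actual loader may differ.

```python
# Sketch: load config.yaml and derive basic simulation quantities.
# Assumes only the keys shown above; the project's actual loader may differ.
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

sim = cfg["simulation"]
n_steps = sim["duration_hours"] * 3600 // sim["timestep_seconds"]   # 1440 steps

speed_kmh = cfg["environment"]["travel_speed_kmh"]

def travel_minutes(distance_km: float) -> float:
    """Travel time at the configured urban delivery speed."""
    return distance_km / speed_kmh * 60

print(f"{n_steps} timesteps; 1 km takes {travel_minutes(1.0):.1f} min")
```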
Academic Contributions:
- Novel RL approach to dynamic postponement in delivery logistics
- Comprehensive benchmarking framework for RMDP algorithms
- Statistical validation with real-world data across multiple contexts
- Feature engineering insights for delivery optimization
Industry Applications:
- Delivery Platforms: Enhanced assignment algorithms for complex urban environments
- Fleet Management: Dynamic postponement strategies for better resource utilization
- Urban Logistics: Scalable solutions for high-demand delivery scenarios
- Algorithm Integration: RL postponement modules for existing dispatch systems
Future Research Directions:
- Spatial density features for enhanced decision-making
- Stochastic travel times and courier rejection modeling
- Multi-objective optimization across stakeholder priorities
- Real-world pilot testing and validation
Machine Learning:
- Deep Q-Network (DQN) with experience replay (see the sketch below)
- State space: 7 features (time, congestion, bundling potential, etc.)
- Action space: Binary postponement decisions
- Reward function: Bundling optimization with penalty terms
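A minimal PyTorch sketch matching the setup described above (7-feature state, binary action, discount factor 0.95, batch size 32, learning rate 0.0005). The layer widths and replay-buffer capacity are assumptions, not the project's exact architecture.

```python
# Minimal DQN sketch for a 7-feature state and binary postponement action.
# Layer widths and the replay-buffer capacity are assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 7, 2   # actions: assign now vs. postpone

class QNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)

policy_net, target_net = QNetwork(), QNetwork()
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=5e-4)
replay = deque(maxlen=50_000)   # (state, action, reward, next_state, done) tuples
gamma = 0.95

def train_step(batch_size: int = 32) -> None:
    """One gradient step on a random minibatch from the replay buffer."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = (torch.tensor(x, dtype=torch.float32) for x in zip(*batch))
    q = policy_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * target_net(s2).max(1).values * (1 - done)
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```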
Data Processing:
- 647,395 real orders from Meituan dataset
- Geographic clustering and demand pattern analysis
- Statistical significance testing (paired t-tests, effect sizes); see the sketch below
- Demand classification using total delay terciles
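For the paired significance tests, a SciPy sketch over per-scenario delays; the arrays below are made-up placeholders, not results from the benchmarks.

```python
# Sketch: paired t-test and a paired effect size between two algorithms
# evaluated on the same scenarios. The numbers are placeholders, not results.
import numpy as np
from scipy import stats

rl_aca_delay = np.array([4.1, 6.3, 2.8, 9.5, 5.0, 7.7, 3.9, 6.1])   # min per scenario
aca17_delay  = np.array([5.2, 7.0, 3.5, 10.9, 5.4, 9.1, 4.6, 6.8])

t_stat, p_value = stats.ttest_rel(rl_aca_delay, aca17_delay)

diff = rl_aca_delay - aca17_delay
cohens_d = diff.mean() / diff.std(ddof=1)   # effect size of the paired differences

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```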
Performance Metrics:
- On-time delivery rate, average/maximum delay (see the sketch below)
- Distance efficiency, idle time, total system delay
- Postponement rates and bundling effectiveness
- Multi-stakeholder KPI framework
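A small sketch of the core delay KPIs; the (deadline, delivered) record format is an assumption for illustration, not the repository's data structure.

```python
# Sketch: on-time rate and delay statistics from completed orders.
# The (deadline_minute, delivered_minute) record format is an assumption.
from statistics import mean

completed = [(45, 43), (60, 66), (30, 30), (50, 58), (40, 39)]

delays = [max(0, delivered - deadline) for deadline, delivered in completed]
on_time_rate = sum(d == 0 for d in delays) / len(delays)

print(f"on-time rate: {on_time_rate:.0%}")
print(f"average delay: {mean(delays):.1f} min, max delay: {max(delays)} min")
```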
This project is licensed under the MIT License - see the LICENSE file for details.
References:
- Ulmer, M. W., Thomas, B. W., Campbell, A. M., & Woyak, N. (2021). The restaurant meal delivery problem: Dynamic pickup and delivery with deadlines and random ready times. Transportation Science, 55(1), 75-100.
- Meituan Challenge Dataset: Restaurant meal delivery optimization competition data.
Author: Tristan Kruse
Email: krusetristan1@gmail.com
Project type: Master's thesis
Repository: RMDP_Algorithm
For questions about the research, implementation details, or collaboration opportunities, please open an issue on GitHub or contact directly.