ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

[Figure: Architecture Diagram]

[Figure: Shortcut Flow Can Shortcut Transport]


Installation | Quick Start | Implementation Details | Add Dataset/Environment
Debug & Known Issues | License | Acknowledgement

πŸš€ About ReinFlow

This is the official implementation of "ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning".

ReinFlow is a flexible policy gradient framework for fine-tuning flow matching policies at any denoising step.

How does it work?
πŸ‘‰ First, train flow policies using imitation learning (behavior cloning).
πŸ‘‰ Then, fine-tune them with online reinforcement learning using ReinFlow!

🧩 Supports:

  • βœ… 1-Rectified Flow
  • βœ… Shortcut Models
  • βœ… Any other policy defined by ODEs (in principle)
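For intuition, a policy "defined by an ODE" maps Gaussian noise to an action by numerically integrating a learned velocity field. The sketch below shows the simplest such sampler (fixed-step Euler); the `velocity` function is a toy stand-in for a learned network, not ReinFlow's actual implementation:

```python
import numpy as np

def euler_ode_sample(velocity, act_dim, n_steps=4, seed=0):
    """Sample an action from an ODE-defined flow policy by Euler
    integration of its velocity field from t=0 (noise) to t=1 (action)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    a = rng.standard_normal(act_dim)          # a_0 ~ N(0, I)
    for k in range(n_steps):
        a = a + dt * velocity(a, k * dt)      # deterministic Euler step
    return a

# Toy velocity field that transports samples toward a fixed target action.
target = np.array([0.5, -0.2, 0.1])
vel = lambda a, t: target - a
action = euler_ode_sample(vel, act_dim=3)
```

Because the integration is deterministic given the initial noise, the resulting action distribution has no tractable density in general, which is exactly the difficulty the noise-injection design below addresses.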

πŸ“ˆ Empirical Results: ReinFlow achieves strong performance across a variety of robotic tasks:

  • 🦡 Legged Locomotion (OpenAI Gym)
  • βœ‹ State-based manipulation (Franka Kitchen)
  • πŸ‘€ Visual manipulation (Robomimic)

🧠 Key Innovation: ReinFlow trains a noise injection network end-to-end:

  • βœ… Makes policy probabilities tractable, even with very few denoising steps (e.g., 4, 2, or 1)
  • βœ… Robust to discretization and Monte Carlo approximation errors
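The idea can be sketched as follows: if a learned Gaussian perturbation is injected at every Euler step, the denoising chain becomes a product of conditional Gaussians, so its log-probability is an exact sum of per-step log-densities. All names below are illustrative assumptions for a minimal sketch, not ReinFlow's actual API:

```python
import numpy as np

def gaussian_logpdf(x, mean, std):
    """Log density of an isotropic Gaussian, summed over action dims."""
    return float(np.sum(-0.5 * ((x - mean) / std) ** 2
                        - np.log(std) - 0.5 * np.log(2 * np.pi)))

def sample_with_logprob(velocity, noise_std, act_dim, n_steps=4, seed=0):
    """Few-step Euler integration with learned Gaussian noise injection.

    Each step's mean follows the velocity field; adding noise with a
    (state-dependent) learned scale makes the chain's log-probability a
    tractable sum of per-step Gaussian log-densities.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    a = rng.standard_normal(act_dim)              # a_0 ~ N(0, I)
    logprob = 0.0
    for k in range(n_steps):
        t = k * dt
        mean = a + dt * velocity(a, t)            # deterministic Euler step
        std = noise_std(a, t)                     # learned injection scale
        a_next = mean + std * rng.standard_normal(act_dim)
        logprob += gaussian_logpdf(a_next, mean, std)
        a = a_next
    return a, logprob

# Toy stand-ins for the learned networks (assumptions, not ReinFlow's code):
vel = lambda a, t: -a            # pushes samples toward the origin
std_fn = lambda a, t: 0.1        # constant injection scale
action, logp = sample_with_logprob(vel, std_fn, act_dim=3, n_steps=4)
```

The tractable `logprob` is what makes standard policy-gradient objectives applicable even at very few denoising steps.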

Learn more on our πŸ”— project website or check out the arXiv paper.

πŸ“’ News

  • [2025/07/30] Fixed the rendering bug in Robomimic. Now supports rendering at 1080p resolution.
  • [2025/07/29] Added a tutorial to the docs on recording videos during evaluation.
  • [2025/06/14] Updated the webpage with a detailed explanation of the algorithm design.
  • [2025/05/28] Paper is posted on arXiv!

πŸš€ Installation

Please follow the steps in installation/reinflow-setup.md.

πŸš€ Quick Start: Reproduce Our Results

To fully reproduce our experiments, please refer to ReproduceExps.md.

To download our training data and reproduce the plots in the paper, please refer to ReproduceFigs.md.

πŸš€ Implementation Details

Please refer to Implement.md for descriptions of key hyperparameters of FQL, DPPO, and ReinFlow.

πŸš€ Adding Your Own Dataset or Environment

Please refer to Custom.md.

πŸš€ Debug Aid and Known Issues

Please refer to KnownIssues.md to see how to resolve errors you encounter.

⭐ Coming Soon

  • Support fine-tuning Mean Flow with online RL
  • Possibly open-source the WandB projects via a corporate account (training logs are currently provided in .csv format)
  • Replace figures with videos in the task drop-down menus on the webpage

License

This repository is released under the MIT license. See LICENSE. If you use our code, we would appreciate it if you included the license text at the top of your scripts.

Acknowledgement

This repository was developed from multiple open-source projects. For the full list of references, please refer to Acknowledgement.md.

Star History

Star History Chart
