An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Updated Jan 8, 2025 - Python
A Python wrapper that enables large language models (LLMs) to simulate the step-by-step thinking process of OpenAI’s o1 model, providing users with detailed reasoning and comprehensive answers.
It thinks like OpenAI o1.