SimpleBerry/LLaMA-O1

LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace

Towards Open-Source Large Reasoning Models

News

The first version of LLaMA-O1 is now available on Hugging Face. Here we come!

Supervised:

https://huggingface.co/SimpleBerry/LLaMA-O1-Supervised-1129

Base(Pretrain):

https://huggingface.co/SimpleBerry/LLaMA-O1-Base-1127

Supervised Finetune Dataset:

https://huggingface.co/datasets/SimpleBerry/OpenLongCoT-SFT

Pretraining Dataset:

https://huggingface.co/datasets/SimpleBerry/OpenLongCoT-Pretrain-1202

RLHF is on the way! View our GitHub Repo:

https://github.com/SimpleBerry/LLaMA-O1

Our ongoing related research:

https://huggingface.co/papers/2406.07394

https://huggingface.co/papers/2410.02884

https://huggingface.co/papers/2411.18203

GGUF: https://huggingface.co/Lyte/LLaMA-O1-Supervised-1129-Q4_K_M-GGUF

Online Demo (CPU-only): https://huggingface.co/spaces/SimpleBerry/LLaMA-O1-Supervised-1129-Demo
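For readers who want to try the released checkpoints locally, here is a minimal sketch using the Hugging Face `transformers` library. It assumes the checkpoints load with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes, which this README does not confirm. The `generate` helper and its parameters are hypothetical names introduced for illustration, and the prompt format the model expects is not documented here.

```python
# Repository IDs copied from the links above.
SUPERVISED_MODEL_ID = "SimpleBerry/LLaMA-O1-Supervised-1129"
BASE_MODEL_ID = "SimpleBerry/LLaMA-O1-Base-1127"


def generate(prompt: str,
             model_id: str = SUPERVISED_MODEL_ID,
             max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load a checkpoint and return the decoded output.

    Assumption: the checkpoint works with the generic causal-LM classes.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the raw prompt; no chat or reasoning template is applied here.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded text contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("What is 12 * 13?")` downloads the weights on first use, so expect a large download and slow CPU inference. The GGUF build linked above may be a lighter option for local runs.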

Roadmap of LLaMA-O1

  • Marked Language of Long CoT (Done)
  • Pretrain Dataset (Done)
  • Supervised Dataset (Done)
  • PRM token rectification Dataset (Done)
  • Reinforcement Learning With Self-Play (Code done, training in progress)
  • Inference-time Reasoning Enhancement Frameworks (Code done, temporarily postponed)

About

Large Reasoning Models
