karthik-nexusflow

Popular repositories

trlx Public

Forked from thwu1/trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python
OpenRLHF Public

Forked from OpenRLHF/OpenRLHF

A Ray-based High-performance RLHF framework (Support 70B+ full tuning & LoRA & Mixtral & KTO)

Python
arena-hard-auto Public

Forked from lmarena/arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Jupyter Notebook