Hoang Anh Just¹, Ming Jin¹, Anit Sahu², Huy Phan², Ruoxi Jia¹
¹Virginia Tech, ²Amazon
Paper: https://arxiv.org/abs/2407.14477
This repository contains the code for our paper Data-Centric Human Preference Optimization with Rationales.
We propose to enhance existing preference learning frameworks with rationales that explain why one response is preferred over the other.
We have generated rationale-enhanced datasets based on the prompts described in our paper.
We currently provide the rationale-enhanced datasets for the Intel-ORCA-DPO-pairs:
- with general rationales: https://huggingface.co/datasets/redsgnaoh/orcaratgen
- with detailed rationales: https://huggingface.co/datasets/redsgnaoh/orcaratspec
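For a quick look at the data, the datasets can be loaded with the Hugging Face `datasets` library. This is a minimal sketch only; the split and column names are whatever the dataset cards define, so the code simply prints them.

```python
# Minimal sketch (assumes `pip install datasets`).
from datasets import load_dataset

# Detailed-rationale version of the Intel-ORCA-DPO pairs.
ds = load_dataset("redsgnaoh/orcaratspec")
print(ds)  # available splits and their sizes

# Peek at one example; the exact column names are defined by the dataset card.
first_split = next(iter(ds.values()))
print(first_split[0])
```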
To train any supported model on any dataset, edit the configuration files and run the training script. For example, to train Mistral-7B-Instruct-v0.2 with the RDPO (DPO with Rationales) loss, run:
python train.py loss=rdpo ++loss.gamma=0.001 lr=5e-7 model=mistral7bv2 datasets=[orcaratspec] exp_name=rdpo_g0-001_lr5e-7_orcaratspec_mistral7b2 mode=train
Use `rorpo-simple` for the RORPO loss and `rdpo` for the RDPO loss.
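If you want to compare several settings, the command above can also be scripted. The following is a purely illustrative Python sketch that sweeps the `loss.gamma` weight from the example; the gamma values and experiment names are placeholders, not the settings used in the paper.

```python
# Illustrative sweep over the rationale weight gamma for the RDPO loss,
# reusing the training command shown above. Values are examples only.
import subprocess

for gamma in [0.001, 0.01]:  # placeholder values
    exp_name = f"rdpo_g{str(gamma).replace('.', '-')}_lr5e-7_orcaratspec_mistral7b2"
    subprocess.run(
        [
            "python", "train.py",
            "loss=rdpo",
            f"++loss.gamma={gamma}",
            "lr=5e-7",
            "model=mistral7bv2",
            "datasets=[orcaratspec]",
            f"exp_name={exp_name}",
            "mode=train",
        ],
        check=True,  # stop the sweep if a run fails
    )
```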
To sample from the trained model for evaluation on the AlpacaEval 2.0 benchmark, run the following command (using the model trained above):
python eval.py --config-path=data/rdpo_g0-001_lr5e-7_orcaratspec_mistral7b2 --config-name=config ++mode=alpacaeval ++n_samples=805 ++model.eval_batch_size=35 ++samples_dir=folder_for_alpaca_samples ++exp_name=your_exp_name
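Once the samples are written to `samples_dir`, they can be scored with the official `alpaca_eval` CLI. The sketch below is an assumption about that workflow, not part of this repository: the output file path is a placeholder, and the default AlpacaEval 2.0 annotator requires an OpenAI API key.

```python
# Illustrative sketch: score the generated samples with AlpacaEval 2.0
# (assumes `pip install alpaca-eval` and OPENAI_API_KEY set for the default annotator).
import subprocess

subprocess.run(
    [
        "alpaca_eval",
        # Placeholder path: point this at the JSON of model outputs produced by eval.py.
        "--model_outputs", "folder_for_alpaca_samples/your_exp_name.json",
    ],
    check=True,
)
```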
Feel free to evaluate on other prompt sets as well, following the code repository below.
The code is based on this repository: https://github.com/ContextualAI/HALOs/. We thank the authors for providing user-friendly code.
For more details on modifying the code, please check out their repository.
Please feel free to contact us with any questions, suggestions, or comments. Thank you for your help!
Please cite our paper if you find the repo helpful in your work:
@misc{just2024datacentrichumanpreferenceoptimization,
title={Data-Centric Human Preference Optimization with Rationales},
author={Hoang Anh Just and Ming Jin and Anit Sahu and Huy Phan and Ruoxi Jia},
year={2024},
eprint={2407.14477},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2407.14477},
}