Dataset collection and preprocessing framework for NLP extreme multitask learning
Updated Jul 10, 2024 - Python
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
[ACL 2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Learning to Route Instances for Human vs. AI Feedback
The code used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging"