Official Python implementation of Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization (ICML 2024 & Spotlight at NeurIPS 2023 ALOE Workshop)
Li Ding, Jenny Zhang, Jeff Clune, Lee Spector, Joel Lehman
TL;DR: QDHF enhances QD algorithms by inferring diversity metrics from human judgments of similarity, surpassing state-of-the-art methods in automatic diversity discovery in robotics & RL tasks and significantly improving performance in open-ended generative tasks.
QDHF (right) improves the diversity in text-to-image generation results compared to best-of-N (left) using Stable Diffusion.
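To make the idea in the TL;DR concrete, here is a minimal sketch of how a diversity metric could be fitted to similarity judgments. It assumes a linear projection from feature vectors to a low-dimensional descriptor and a triplet hinge loss, where a judge has said the anchor is more similar to `positive` than to `negative`. The function names, the margin value, and the toy data are illustrative assumptions, not the repository's API; see the paper and the code in each experiment folder for the actual method.

```python
import math


def project(weights, features):
    """Linear projection of a feature vector onto diversity descriptors.

    `weights` is a list of rows, one row per output dimension (assumed layout).
    """
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(weights, anchor, positive, negative, margin=1.0):
    """Hinge loss for one similarity judgment.

    The loss is zero once the projected anchor is closer to `positive`
    than to `negative` by at least `margin`.
    """
    za, zp, zn = (project(weights, v) for v in (anchor, positive, negative))
    return max(0.0, margin + euclidean(za, zp) - euclidean(za, zn))


# Toy judgment: the anchor was judged more similar to `positive`.
anchor, positive, negative = [0.0, 0.0], [0.1, 5.0], [3.0, 0.0]

# A projection onto the first feature agrees with the judgment.
loss_agree = triplet_loss([[1.0, 0.0]], anchor, positive, negative)

# A projection onto the second feature contradicts it and is penalized.
loss_disagree = triplet_loss([[0.0, 1.0]], anchor, positive, negative)
```

Minimizing this loss over many judgments would favor projections whose distances match the judged similarities; the resulting descriptor could then serve as the diversity measure for a QD algorithm.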
- 2024-06-24: Release of the QDHF Gradio Demo on Hugging Face.
- 2024-03-14: Release of the QDHF tutorial in pyribs.
- 2023-12-13: Initial release of the codebase.
We have released a Gradio Demo on Hugging Face. This user-friendly interface enables effortless exploration of QDHF without any coding requirements. Special thanks to Jenny Zhang for her contributions!
We have released a tutorial: Incorporating Human Feedback into Quality Diversity for Diversified Text-to-Image Generation, together with the pyribs team. This tutorial features a lightweight version of QDHF and runs on Google Colab in ~1 hour. Dive into the tutorial to explore how QDHF enhances GenAI models with diversified, high-quality responses and apply these insights to your projects!
To install the requirements, run:
pip install -r requirements.txt
Each experiment provides a main.py script that runs it. For example, to run the robotic arm experiment:
cd arm
python3 main.py
Replace arm with the name of the experiment you want to run.
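If you prefer to launch experiments from Python instead of the shell, the two commands above can be wrapped in a small helper. This is a sketch under assumptions: `build_command` and `run_experiment` are hypothetical names that are not part of this repository, and only the `arm` directory name is taken from the example above.

```python
import subprocess
import sys
from pathlib import Path


def build_command(experiment, repo_root="."):
    """Return (argv, cwd) equivalent to `cd <experiment>` then `python3 main.py`."""
    cwd = Path(repo_root) / experiment
    # Use the current interpreter so the installed requirements are picked up.
    return [sys.executable, "main.py"], cwd


def run_experiment(experiment, repo_root="."):
    """Run an experiment's main.py from inside its own directory."""
    argv, cwd = build_command(experiment, repo_root)
    if not (cwd / "main.py").is_file():
        raise FileNotFoundError(f"no main.py found in {cwd}")
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Assumes this file sits in the repository root; "arm" is the
    # robotic arm experiment from the example above.
    run_experiment("arm")
```

Running from inside the experiment directory matters here because the instructions above `cd` into it first, so relative paths used by main.py are presumably resolved against that folder.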
If you find our work or any of our materials useful, please cite our paper:
@inproceedings{
ding2024quality,
title={Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization},
author={Li Ding and Jenny Zhang and Jeff Clune and Lee Spector and Joel Lehman},
booktitle={Forty-first International Conference on Machine Learning},
year={2024},
url={https://openreview.net/forum?id=9zlZuAAb08}
}
This project is under the MIT License.
The overall structure of this code is adapted from DQD. Each experiment contains its own modified version of pyribs, a quality diversity optimization library. The maze navigation experiment uses a modified version of Kheperax. The LSI experiment uses Stable Diffusion (huggingface/diffusers), OpenAI CLIP, and DreamSim. Funding acknowledgments are disclosed in the paper.