QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning

QLESS integrates gradient quantization with the LESS framework to enable memory-efficient data valuation and selection for fine-tuning large language models (LLMs). By combining LoRA-based random projection with absmax quantization, QLESS reduces gradient storage by up to 16x while maintaining competitive performance on benchmarks like MMLU, BBH, and TyDiQA. This repository provides code for reproducing experiments and applying QLESS to custom datasets.
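As a rough illustration of the absmax quantization mentioned above, the sketch below scales a gradient vector by its largest magnitude and rounds it to signed integer codes. It is not the repository's quantize_gradients.py: the function names are hypothetical, NumPy is assumed to be available, and only 2 to 8 bits are covered for simplicity.

```python
import numpy as np

def absmax_quantize(grad: np.ndarray, bits: int = 8):
    """Symmetric absmax quantization (illustrative sketch, 2 to 8 bits)."""
    if not 2 <= bits <= 8:
        raise ValueError("this sketch only covers 2 to 8 bits")
    qmax = 2 ** (bits - 1) - 1               # largest code, e.g. 127 for 8 bits
    absmax = float(np.max(np.abs(grad)))     # largest magnitude in the vector
    scale = absmax / qmax if absmax > 0 else 1.0
    codes = np.clip(np.round(grad / scale), -qmax, qmax).astype(np.int8)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Map integer codes back to approximate float values."""
    return codes.astype(np.float32) * scale
```

Relative to 16-bit storage, b-bit codes take roughly b/16 of the space, ignoring the single scale value stored per vector. Note that this sketch keeps each code in a full int8, so sub-8-bit codes would still need to be packed to realize that saving.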

Setup

Docker Environment

We ran all experiments using the nvcr.io/nvidia/pytorch:23.12-py3 Docker image. To get started, ensure you have Docker installed and pull the image.

Install Dependencies

After configuring your Docker environment (or your native environment), install the required dependencies:

pip install -r requirements.txt
pip install evaluate
pip install traker[fast]
pip install hqq
pip install wandb
pip install -e .

Download Data

The necessary datasets are available at https://huggingface.co/datasets/mosesananta/qless_data. Download and unzip the data into the root directory of this repository.
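For those who prefer to script this step, here is a minimal sketch. It assumes the huggingface_hub package is installed (it may not be listed in requirements.txt) and that the dataset ships as one or more .zip archives at the top level of the dataset repository. The helper names download_qless_data and unzip_all are hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path

def unzip_all(root: str = ".") -> list:
    """Extract every .zip archive found directly under `root` into `root`."""
    extracted = []
    for archive in sorted(Path(root).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root)
        extracted.append(archive.name)
    return extracted

def download_qless_data(root: str = ".") -> list:
    """Download the dataset repo into `root`, then unzip any archives there."""
    # Imported lazily so unzip_all still works without huggingface_hub.
    from huggingface_hub import snapshot_download
    snapshot_download(
        repo_id="mosesananta/qless_data",
        repo_type="dataset",
        local_dir=root,
    )
    return unzip_all(root)
```

Calling download_qless_data(".") from the repository root should leave the unzipped data next to full_run.sh. If the archive layout differs from the assumption above, adjust the glob pattern accordingly.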

Run Full Experiment

Execute the following script to run the complete experimental pipeline:

./full_run.sh <model_path> <output_model_name> <seed>

For instance, to run the experiment using the meta-llama/Llama-3.2-3B model:

./full_run.sh "meta-llama/Llama-3.2-3B" "meta-llama-3.2-3b" 3

This creates three folders:

out: Contains all models and evaluation results.
grads_16bit: Contains all gradient data.
selected_datas: Contains the data selected through influence-based selection.
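To give a feel for what influence-based selection means here, the sketch below ranks training examples by the cosine similarity between their gradient features and those of a validation set, then keeps the top k. This is a simplified illustration, not the pipeline that full_run.sh runs: it assumes a single checkpoint and a plain mean over validation examples, and the function name select_top_k and its parameters are hypothetical.

```python
import numpy as np

def select_top_k(train_grads: np.ndarray, val_grads: np.ndarray, k: int):
    """Rank training rows by mean cosine similarity to validation rows."""
    eps = 1e-12  # guards against division by zero for all-zero gradients
    t_norm = np.maximum(np.linalg.norm(train_grads, axis=1, keepdims=True), eps)
    v_norm = np.maximum(np.linalg.norm(val_grads, axis=1, keepdims=True), eps)
    t = train_grads / t_norm
    v = val_grads / v_norm
    scores = (t @ v.T).mean(axis=1)        # one influence score per training row
    top = np.argsort(-scores)[:k]          # indices of the k highest scores
    return top, scores
```

With quantized gradients, the same computation would run on dequantized values (or directly on the integer codes, since cosine similarity is invariant to a positive per-vector scale).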
