[NeurIPS 2024] VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks

This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks.

🎉 VB-LoRA is now integrated into the 🤗 Hugging Face State-of-the-art Parameter-Efficient Fine-Tuning (PEFT) library. Please check the docs, code, and examples.
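As a quick reference, below is a minimal sketch of fine-tuning with VB-LoRA through PEFT. The class and argument names follow the PEFT VB-LoRA documentation; the base model and the hyperparameter values are illustrative assumptions, so please check the linked docs for the current API.

# Minimal sketch (assumed hyperparameters; see the PEFT docs for details)
from transformers import AutoModelForSequenceClassification
from peft import VBLoRAConfig, get_peft_model

base_model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

config = VBLoRAConfig(
    r=4,                          # rank of the LoRA-style update
    num_vectors=90,               # size of the shared vector bank
    vector_length=256,            # length of each bank vector (should divide the hidden size)
    topk=2,                       # number of bank vectors mixed per sub-vector
    target_modules=["query", "value"],
    save_only_topk_weights=True,  # store only top-k indices and weights for maximal compression
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()  # only the vector bank and selection logits are trainable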

Abstract

As the adoption of large language models increases and the need for per-user or per-task model customization grows, the parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and its variants, incur substantial storage and transmission costs. To further reduce stored parameters, we introduce a "divide-and-share" paradigm that breaks the barriers of low-rank decomposition across matrix dimensions, modules and layers by sharing parameters globally via a vector bank. As an instantiation of the paradigm to LoRA, our proposed VB-LoRA composites all the low-rank matrices of LoRA from a shared vector bank with a differentiable top-k admixture module. VB-LoRA achieves extreme parameter efficiency while maintaining comparable or better performance compared to state-of-the-art PEFT methods. Extensive experiments demonstrate the effectiveness of VB-LoRA on natural language understanding, natural language generation, and instruction tuning tasks. When fine-tuning the Llama2-13B model, VB-LoRA only uses 0.4% of LoRA's stored parameters, yet achieves superior results.

Overview of VB-LoRA

Overview of VB-LoRA. Left: The model parameters can be represented as a composition of vectors from a vector bank, which is shared across sub-vectors, modules and layers. Right: Architecture of VB-LoRA. We use a top-k softmax function to select k vectors from the vector bank. The selected vectors are then pooled into a sub-vector, which is arranged at a desired position, forming the parameters of LoRA.
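For intuition, here is a small, self-contained PyTorch sketch of the top-k admixture described above. It is illustrative only and does not reproduce the repository's implementation; the bank size and vector length are arbitrary assumptions.

import torch
import torch.nn.functional as F

def compose_sub_vector(vector_bank, logits, k=2):
    # vector_bank: (num_vectors, vector_length), shared across sub-vectors, modules and layers
    # logits:      (num_vectors,) learnable selection logits for one sub-vector
    topk_vals, topk_idx = torch.topk(logits, k)   # pick the k highest-scoring bank vectors
    weights = F.softmax(topk_vals, dim=-1)        # softmax over the selected logits only
    return weights @ vector_bank[topk_idx]        # weighted pooling -> one sub-vector

bank = torch.randn(90, 256)                       # assumed bank of 90 vectors of length 256
logits = torch.randn(90)
sub_vector = compose_sub_vector(bank, logits)     # sub-vectors are then arranged into LoRA matrices
print(sub_vector.shape)                           # torch.Size([256])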

Comparison of the PEFT methods

Comparison with other PEFT methods on RoBERTa-Large. VB-LoRA achieves higher scores with a significantly smaller number of stored parameters.

Steps to reproduce the results

NLU

  • Modified code for running the Natural Language Understanding (NLU) experiments.
  • Adapted from the LoRA source code.

Create and activate conda env

cd NLU/NLU
conda env create -f environment.yml
conda activate VB_LoRA_NLU

Install the pre-requisites

vb-lora:

pip install -e ..

NLU:

pip install -e .

Start the experiments

The scripts are located in the "NLU/scripts_vblora_all" and "NLU/scripts_vblora_qv" folders.

For example,

./scripts_vblora_all/roberta_base_cola.sh

Instruction Tuning

  • The code for running Llama2 is adapted from the QLoRA source code.
  • Fine-tuning the Llama2 model requires access to the model weights on Hugging Face. Ensure you have access before running the code (see the login sketch below).
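As a convenience, here is a minimal authentication sketch, assuming you have requested access to the meta-llama models and use a personal access token; the token value is a placeholder. Running huggingface-cli login in a terminal works equally well.

# Minimal sketch: authenticate so the gated Llama2 weights can be downloaded.
from huggingface_hub import login

login(token="hf_xxx")  # placeholder; replace with your own token once access is approved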

Create and activate conda env

cd instruction_tuning
conda create -n instruction_tuning python==3.10
conda activate instruction_tuning

Install the pre-requisites

pip install -r requirements.txt

Start the experiments

The scripts are located in the "instruction_tuning/scripts" folder.

For example,

cd instruction_tuning
./scripts/finetune_llama2_7b_vblora.sh

For evaluation, please use LLM Judge.

Math Instruction Tuning

Create and activate conda env

cd math_instruction_tuning
conda create -n math_instruction_tuning python==3.8.13
conda activate math_instruction_tuning

Install the pre-requisites

pip install -r requirements.txt

Start the experiments

The scripts are located in the "math_instruction_tuning" folder.

For example,

./run_instruction_tuning_vblora.sh

Citation

If you find this code useful, please cite our paper.

@inproceedings{li2024vblora,
      title={VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks}, 
      author={Yang Li and Shaobo Han and Shihao Ji},
      booktitle={The 38th Conference on Neural Information Processing Systems (NeurIPS)},
      year={2024}
}
