Add support for QLoRA / QAdapter training via bitsandbytes #663
Conversation
Looks good, just one small question about something that is unclear to me.
    # result shape: <batch_size> x <seq_len> x <head_dim>
    layer_output = F.linear(input_states, weight, bias=self.bias)
else:
    layer_output = super().forward(input_states)
Which forward method is called here, since this does not inherit from nn.Linear anymore?
The subclasses of this (LoRALinearTorch, LoRALinear4bit, LoRALinear8bitLt) inherit from different types of linear layers.
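For illustration, a rough sketch of how such a hierarchy resolves `super().forward()` via Python's MRO; only the three class names from this thread and the bitsandbytes layer types are taken from the source, the bodies are placeholders rather than the PR's actual code:

```python
import bitsandbytes as bnb
import torch.nn as nn


class LoRALinear:
    """Mixin holding the shared LoRA logic; not itself an nn.Linear."""

    def forward(self, input_states):
        # (LoRA-specific branches elided)
        # Falls through to the next class in the MRO, i.e. the concrete
        # linear layer that the instantiated subclass inherits from.
        return super().forward(input_states)


# Each subclass mixes the shared logic into a different linear base,
# so super().forward() dispatches to a different implementation:
class LoRALinearTorch(LoRALinear, nn.Linear):             # -> nn.Linear.forward
    pass


class LoRALinear4bit(LoRALinear, bnb.nn.Linear4bit):      # -> Linear4bit.forward
    pass


class LoRALinear8bitLt(LoRALinear, bnb.nn.Linear8bitLt):  # -> Linear8bitLt.forward
    pass
```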
…678) Adapters currently does not work correctly when passing `device_map="auto"` to a model's `from_pretrained()`. Device auto-mapping is handled by HF Accelerate, which wraps the original module forward method. This PR fixes compatibility of Adapters' post-hoc model wrapping with Accelerate's device auto-mapping by wrapping the forward pass. Fixing this is required for enabling quantized training of adapters (bottleneck & prefix-tuning) in #663.
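For illustration only (none of the names below appear in the PR): replacing `module.forward` outright would discard the wrapper that Accelerate installs for device placement, whereas wrapping whatever `forward` currently is keeps it in the call chain:

```python
import functools

import torch.nn as nn


def wrap_forward(module: nn.Module, post_forward_fn):
    """Hypothetical helper: wrap the module's current forward (which may
    already be Accelerate's device-placement wrapper) instead of
    replacing it, so existing hooks keep running."""
    original_forward = module.forward  # possibly already wrapped by Accelerate

    @functools.wraps(original_forward)
    def wrapped_forward(*args, **kwargs):
        output = original_forward(*args, **kwargs)  # Accelerate hooks still fire
        return post_forward_fn(module, output)      # additional logic on top

    module.forward = wrapped_forward
```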
This PR adds support for wrapping bitsandbytes' Linear4bit and Linear8bitLt quantization layers with our LoRA implementation, enabling training of LoRA adapters on quantized models in QLoRA style. The implementation is loosely similar to HF peft's approach, which can be found here: https://github.com/huggingface/peft/blob/v0.10.0/src/peft/tuners/lora/bnb.py.
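For context, a hedged sketch of the intended usage pattern: loading a 4-bit quantized model and training a LoRA adapter on top of it. The model ID, adapter name, and LoRA hyperparameters below are placeholders; the notebook's exact configuration may differ.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

import adapters
from adapters import LoRAConfig

# Load the base model quantized to 4 bit via bitsandbytes.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model ID
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Enable adapter support on the plain transformers model, then add and
# activate a LoRA adapter; only the LoRA weights are trained (QLoRA-style).
adapters.init(model)
model.add_adapter("assistant_adapter", config=LoRAConfig())
model.train_adapter("assistant_adapter")
```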
Demo
I've added a new notebook here: https://github.com/calpt/adapter-transformers/blob/dev/qlora/notebooks/QLoRA_Llama_Finetuning.ipynb.
The notebook showcases this feature by finetuning a 4bit-quantized Llama 2 7B on an instruction tuning dataset (similar to Guanaco in the QLoRA paper).
Tested that it runs without errors in the provided notebook; other setups are not extensively tested yet.
Pre-trained checkpoints
Adapters trained with the notebook code can be found here:
- Llama-2 7B: https://huggingface.co/AdapterHub/llama2-7b-qlora-openassistant
- Llama-2 13B: https://huggingface.co/AdapterHub/llama2-13b-qlora-openassistant
Current limitations