I fixed up the PR. I should also clarify: the PR only enables CUDA acceleration for f16 models. I previously misunderstood how ggml LoRAs are applied. What actually needs to happen is modifying the weights with the LoRA delta, which is complicated by the fact that by that point the weights are already in VRAM, where regular ggml operations can't reach them.
Are there any plans to support this?
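For context, the merge itself is conceptually simple: add a low-rank update to the base weight, `W' = W + scale * (B @ A)`. Here is a minimal CPU-side sketch in plain C (illustrative only, not the ggml API; `lora_merge_f32` is a hypothetical name):

```c
#include <stddef.h>

// Conceptual sketch of a LoRA merge on the CPU:
//   w (n_out x n_in) += scale * B (n_out x r) * A (r x n_in)
// where scale is typically lora_alpha / r.
// The hurdle described above: with CUDA offload, `w` already lives in VRAM,
// so this would have to run as a CUDA kernel (or the delta computed
// host-side and uploaded) rather than a loop over host memory.
static void lora_merge_f32(float *w, const float *a, const float *b,
                           size_t n_out, size_t n_in, size_t r, float scale) {
    for (size_t i = 0; i < n_out; i++) {
        for (size_t j = 0; j < n_in; j++) {
            float delta = 0.0f;
            for (size_t k = 0; k < r; k++) {
                delta += b[i*r + k] * a[k*n_in + j];
            }
            w[i*n_in + j] += scale * delta;
        }
    }
}
```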
Reading some of the past issues, it seems the main blocker is that the CUDA backend uses f32 tensors while LoRA uses f16.
Is that still the case?
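If that's still the blocker, one path would be to up-convert the f16 LoRA tensors to f32 before computing the delta. A self-contained bit-level half-to-float conversion, just to show what that step involves (ggml ships its own fp16 helpers; this standalone version and the name `fp16_to_fp32` are purely illustrative):

```c
#include <stdint.h>
#include <string.h>

// Standalone IEEE-754 half -> float conversion (illustrative).
static float fp16_to_fp32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1F) {              // Inf/NaN: keep the payload
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {          // normal: rebias exponent (127 - 15 = 112)
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant != 0) {         // subnormal: renormalize
        uint32_t e = 113;           // exponent once the hidden bit is found
        while (!(mant & 0x400u)) { mant <<= 1; e--; }
        bits = sign | (e << 23) | ((mant & 0x3FFu) << 13);
    } else {                        // signed zero
        bits = sign;
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Applying the f16 LoRA delta to f32 weights would then just be a per-element conversion like this before the accumulation in the merge loop above.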
I can take a shot at implementing this if someone can give me a rough rundown of the hurdles.