
lora: unapply current loras properly #590

Open
wants to merge 1 commit into master
Conversation

stduhpf
Contributor

@stduhpf stduhpf commented Feb 7, 2025

Fixes #410
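For context, here is a minimal, self-contained sketch of the idea behind the fix (the names and data layout are hypothetical, not the actual stable-diffusion.cpp code): a LoRA that has been merged into the weights can be "unapplied" by merging the same delta again with a negated scale before the next set of LoRAs is applied.

```cpp
// Hypothetical sketch, not the real project code: LoRA deltas are merged
// into the base weights, and un-applying them is just merging with the
// opposite sign.
#include <cstdio>
#include <vector>

struct Lora {
    std::vector<float> delta;  // precomputed B*A delta, same shape as the weights
    float multiplier;          // strength requested in the prompt
};

// Merge (or un-merge, if scale is negative) a LoRA delta into the weights.
static void merge(std::vector<float>& weights, const Lora& lora, float scale) {
    for (size_t i = 0; i < weights.size(); ++i) {
        weights[i] += scale * lora.multiplier * lora.delta[i];
    }
}

int main() {
    std::vector<float> weights = {0.10f, -0.25f, 0.42f};
    Lora lora = {{0.01f, 0.02f, -0.03f}, 1.0f};

    merge(weights, lora, +1.0f);  // apply for a prompt that uses the LoRA
    merge(weights, lora, -1.0f);  // unapply before the next prompt

    // The weights are back to (almost) their original values,
    // up to floating-point rounding error.
    for (float w : weights) std::printf("%.9f\n", w);
    return 0;
}
```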

@fszontagh
Contributor

I just tested this in the GUI without a checkpoint reload:

  1. run:
    Prompt (without lora): a cute cat glass statue
    txt2img_1739009207_473208252_512x768

  2. run:
    Prompt (with lora): a cute cat glass statue <lora:glass_statue_v1:1>
    txt2img_1739009215_473208252_512x768

  3. run:
    Prompt (without lora): a cute cat glass statue
    txt2img_1739009226_473208252_512x768

No other parameters were changed between the runs, only the prompt.
The first and last outputs have a slight difference.

The trigger word was glass statue; the LoRA is from here.

The parameters:

model: catCitronAnimeTreasure_v10
parser: stable-diffusion.cpp
cfg scale: 7.00
operations: txt2img
scheduler: default
app: StableDiffusionGUI 0.2.5 0d0fc16
size: 512x768
sampler: Euler a
backend: stable-diffusion.cpp (ver.: d46ed5e using cuda)
seed: 473208252
steps: 30
schedule type: Automatic

@stduhpf
Contributor Author

stduhpf commented Feb 8, 2025

Yes, there is always going to be some very slight difference due to precision issues. With floats, (a+b)-b might not always be exactly the same as a, but I believe it should be close enough... I wouldn't have noticed the images were different if you hadn't pointed it out. It will probably be worse with heavily quantized models though, as the errors get bigger.
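To illustrate the rounding issue with a standalone example (unrelated to the actual model weights):

```cpp
// (a + b) - b is not guaranteed to equal a exactly in floating point;
// the error grows when the magnitudes of a and b differ a lot.
#include <cstdio>

int main() {
    float a = 0.1f;
    float b = 1000.0f;
    float round_trip = (a + b) - b;
    std::printf("a           = %.9f\n", a);
    std::printf("(a + b) - b = %.9f\n", round_trip);
    std::printf("difference  = %.9g\n", round_trip - a);
    return 0;
}
```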

The only alternative would be to completely reload the model every time the LoRAs are changed, but this would increase the time per generation a bit. (Maybe this could be an optional feature?)

Or have the LoRA weights loaded separately and added in the forward pass during inference, but this would require a thorough refactor of every model, and it would end up making the models run slower with higher memory requirements when using LoRAs.
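Roughly, that alternative would look like this (a hypothetical sketch, not the project's actual API): the base weights are never touched, and each linear layer adds the low-rank contribution on the fly.

```cpp
// Hypothetical sketch of keeping LoRA weights separate: the forward pass
// computes y = W*x + scale * B*(A*x) instead of merging B*A into W.
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

static Vec matvec(const Mat& m, const Vec& x) {
    Vec y(m.size(), 0.0f);
    for (size_t i = 0; i < m.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += m[i][j] * x[j];
    return y;
}

// Linear layer with a low-rank adapter: W is out x in, A is r x in, B is out x r.
Vec lora_forward(const Mat& W, const Mat& A, const Mat& B, float scale, const Vec& x) {
    Vec y = matvec(W, x);      // base projection (unmodified weights)
    Vec h = matvec(A, x);      // down-projection to rank r
    Vec d = matvec(B, h);      // up-projection back to the output size
    for (size_t i = 0; i < y.size(); ++i) {
        y[i] += scale * d[i];  // add the LoRA contribution during inference
    }
    return y;
}
```

The extra matrix multiplications per layer and the A/B tensors kept resident in memory are where the slowdown and the higher memory use would come from.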

@fszontagh
Contributor

but I believe it should be close enough...

Yeah, I think this is acceptable.

The only alternative would be to completely reload the model every time the LoRAs are changed, but this would increase the time per generation a bit. (Maybe this could be an optional feature?)

I think this needs to be implemented on the "user side", not in the lib itself. Or would it be faster if the lib itself could reload the model? Or maybe with some kind of "differential" model reload?

But I am happy with this fix as is.

Or have the LoRA weights loaded separately and added in the forward pass during inference, but this would require a thorough refactor of every model, and it would end up making the models run slower with higher memory requirements when using LoRAs.

How do other implementations handle this situation?

@stduhpf
Contributor Author

stduhpf commented Feb 8, 2025

How do other implementations handle this situation?

I'm not sure. I know this is how it's implemented in LoRA trainers (because they need gradients for the LoRA and the base model separately), but for inference, I don't know for sure.

Successfully merging this pull request may close these issues: apply_loras does not unapply old lora weights