
apply_loras does not unapply old lora weights #410

Open
maj160 opened this issue Sep 14, 2024 · 6 comments · May be fixed by #590

Comments

maj160 commented Sep 14, 2024

This only matters when the same sd_ctx is used for multiple prompts: loras that were applied by a previous prompt but don't appear in the current prompt are not unapplied.

Steps to reproduce:

  • Create an sd_ctx
  • Generate an image using a prompt with a lora
  • Generate an image using a prompt without the lora from the previous step

Expected results:

  • The lora is no longer applied to the second image

Actual results:

  • The lora is still applied to the second image

Why this happens:
curr_lora_state contains the loras that are currently applied and their weights. However, inside apply_loras, we only handle loras in the current prompt, and we don't consider any other loras that are in curr_lora_state and might need to be removed.
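
For reference, the diff computation in apply_loras works roughly like the sketch below (a paraphrase with approximate names and types, not the exact code): only the loras named in the current prompt (lora_state) are visited, so a lora that exists only in curr_lora_state is never diffed back out.

// Paraphrased sketch of the existing diff computation in apply_loras (approximate,
// assuming all three containers are std::unordered_map<std::string, float>).
std::unordered_map<std::string, float> lora_state_diff;
for (const auto& kv : lora_state) {                 // only loras in the *current* prompt
    const std::string& lora_name = kv.first;
    float wanted_multiplier      = kv.second;

    auto it               = curr_lora_state.find(lora_name);
    float curr_multiplier = (it != curr_lora_state.end()) ? it->second : 0.f;

    float multiplier_diff = wanted_multiplier - curr_multiplier;
    if (multiplier_diff != 0.f) {
        lora_state_diff[lora_name] = multiplier_diff;
    }
}
// Loras that are only in curr_lora_state are never added to lora_state_diff,
// so their weights stay baked into the model tensors.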

Potential fix:
In apply_loras(), loop over curr_lora_state and unapply any loras that do not appear in the lora_state (that is, add them to lora_state_diff with negative multiplier)

maj160 (Author) commented Sep 14, 2024

Here's a quick example of the kind of snippet that would fix this (stable_diffusion.cpp:654)

for (auto& kv : curr_lora_state) {
    const std::string& lora_name = kv.first;
    float curr_multiplier        = kv.second;

    // If the lora is no longer in the prompt
    if (lora_state.find(lora_name) == lora_state.end()) {
        float multiplier_diff = -curr_multiplier;
        if (multiplier_diff != 0.f) {
            lora_state_diff[lora_name] = multiplier_diff;
        }
    }
}

There is probably a nicer way to do this, though

grauho (Contributor) commented Sep 15, 2024

I worry about repeated addition and removal of LoRAs with that method introducing a kind of compounding precision error, although perhaps I am being overly cautious. For those desperate to never have to reload the base model file, one could store an unaltered set of tensors from the base model and then apply LoRAs to a copy of those tensors whenever the LoRAs being used, or their respective application strengths, change. Naturally, this would double the amount of space the base model takes up while the program is running.
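
A rough sketch of that idea (hypothetical names, not the current sdcpp API): snapshot each tensor the first time a LoRA is about to modify it, and restore the snapshot before applying a new LoRA configuration, so the weights never accumulate add/remove rounding error.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not existing sdcpp code.
struct TensorBackup {
    std::map<std::string, std::vector<std::uint8_t>> original;  // tensor name -> pristine bytes

    // Save a tensor's data once, before the first LoRA touches it.
    void snapshot(const std::string& name, const void* data, std::size_t nbytes) {
        if (original.count(name) == 0) {
            const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
            original[name].assign(p, p + nbytes);
        }
    }

    // Copy the pristine bytes back into the live tensors before applying a new LoRA set.
    void restore(std::map<std::string, std::pair<void*, std::size_t>>& live_tensors) const {
        for (const auto& kv : original) {
            auto it = live_tensors.find(kv.first);
            if (it != live_tensors.end() && it->second.second == kv.second.size()) {
                std::memcpy(it->second.first, kv.second.data(), kv.second.size());
            }
        }
    }
};

Backing up only the tensors that LoRAs actually touch would keep the extra memory somewhat below a full second copy of the model, though in the worst case it still approaches the doubling mentioned above.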

maj160 (Author) commented Sep 16, 2024

Thanks - I agree that even under the existing apply_loras mechanism there is potential for precision errors, but that's a much harder problem and not really the focus of this bug report.

This report is about buggy behaviour in the current implementation with a (hopefully!) simple fix. I'd rather it didn't get bogged down and lost in a discussion about how LoRAs are handled in general... that feels like a topic for another issue.

grauho (Contributor) commented Sep 16, 2024

The issue is that the two are related. This isn't a problem for the CLI program, because neither the model nor the LoRAs change from batch image to batch image, and even if there are precision errors in the LoRA tensor loading, they are reproducible.

However, for people who want to run sdcpp as a server, it's important to ensure that any solution does not lose precision over time. Otherwise, the same prompt with the same settings and the same seed could produce a different result purely due to precision decay, if many LoRAs have been loaded and unloaded in the intervening time.

The simplest solution when running in this kind of server mode would be to reload the model, either from a file or from a stored copy of the base model tensors, whenever any of the LoRA settings change; otherwise it is difficult to guarantee reliable and reproducible behavior.
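
As a sketch of that guard (hypothetical types and helpers, not existing sdcpp API): compare the requested LoRA configuration against the one currently baked into the weights, and only rebuild from clean base weights when they differ.

#include <map>
#include <string>

using LoraState = std::map<std::string, float>;  // lora name -> multiplier

// Hypothetical model wrapper; the two members stand in for "restore/reload base
// weights" and the existing LoRA application step.
struct ModelHandle {
    void reset_to_base_weights() { /* restore pristine tensors or reload from file */ }
    void apply_loras(const LoraState& /*state*/) { /* existing apply_loras path */ }
};

// Only touch the weights when the requested LoRA configuration differs from the
// applied one, and always start from clean base weights so results stay reproducible.
void prepare_for_request(ModelHandle& model, const LoraState& requested, LoraState& applied) {
    if (requested != applied) {
        model.reset_to_base_weights();
        model.apply_loras(requested);
        applied = requested;
    }
}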

taotaow commented Sep 18, 2024

Reloading the base model is too slow.

piallai (Contributor) commented Jan 19, 2025

Here's a quick example of the kind of snippet that would fix this (stable_diffusion.cpp:654)

for (auto& kv : curr_lora_state) {
    const std::string& lora_name = kv.first;
    float curr_multiplier        = kv.second;

    // If the lora is no longer in the prompt
    if (lora_state.find(lora_name) == lora_state.end()) {
        float multiplier_diff = -curr_multiplier;
        if (multiplier_diff != 0.f) {
            lora_state_diff[lora_name] = multiplier_diff;
        }
    }
}

There is probably a nicer way to do this, though

I support this idea. 'Cleaning' unused loras seems relevant to me. There may indeed be precision issues when multiplying the coefficients over and over, but that is part of the deal with loras as prompt input anyway. For reproducibility, the loras would ideally be parametrized together with the model, but since the strategy here is to avoid reloading the model every time the lora parameters change (for speed), this is the price to pay.

stduhpf linked a pull request Feb 7, 2025 that will close this issue