
apply_loras does not unapply old lora weights #410

Open
maj160 opened this issue Sep 14, 2024 · 6 comments · May be fixed by #590

Comments

maj160 commented Sep 14, 2024

This only matters when the same sd_ctx is used for multiple prompts: loras that were applied by a previous prompt but don't appear in the current prompt are not unapplied.

Steps to reproduce:

  • Create an sd_ctx
  • Generate an image using a prompt with a lora
  • Generate an image using a prompt without the lora from the previous step

Expected results:

  • The lora is no longer applied to the second image

Actual results:

  • The lora is still applied to the second image

Why this happens:
curr_lora_state contains the loras that are currently applied and their weights. However, inside apply_loras, we only handle loras in the current prompt, and we don't consider any other loras that are in curr_lora_state and might need to be removed.
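
For reference, the diff computation in apply_loras works roughly like the sketch below (a paraphrase with approximate names and types, not the exact code): only the loras named in the current prompt (lora_state) are visited, so a lora that exists only in curr_lora_state is never diffed back out.

// Paraphrased sketch of the existing diff computation in apply_loras (approximate,
// assuming all three containers are std::unordered_map<std::string, float>).
std::unordered_map<std::string, float> lora_state_diff;
for (const auto& kv : lora_state) {                 // only loras in the *current* prompt
    const std::string& lora_name = kv.first;
    float wanted_multiplier      = kv.second;

    auto it               = curr_lora_state.find(lora_name);
    float curr_multiplier = (it != curr_lora_state.end()) ? it->second : 0.f;

    float multiplier_diff = wanted_multiplier - curr_multiplier;
    if (multiplier_diff != 0.f) {
        lora_state_diff[lora_name] = multiplier_diff;
    }
}
// Loras that are only in curr_lora_state are never added to lora_state_diff,
// so their weights stay baked into the model tensors.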

Potential fix:
In apply_loras(), loop over curr_lora_state and unapply any loras that do not appear in the lora_state (that is, add them to lora_state_diff with negative multiplier)

maj160 (Author) commented Sep 14, 2024

Here's a quick example of the kind of snippet that would fix this (stable_diffusion.cpp:654)

for (auto& kv : curr_lora_state) {
    const std::string& lora_name = kv.first;
    float curr_multiplier        = kv.second;

    // If the lora is no longer in the prompt
    if (lora_state.find(lora_name) == lora_state.end()) {
        float multiplier_diff = -curr_multiplier;
        if (multiplier_diff != 0.f) {
            lora_state_diff[lora_name] = multiplier_diff;
        }
    }
}

There is probably a nicer way to do this, though

grauho (Contributor) commented Sep 15, 2024

I worry about repeated addition and removal of LoRAs with that method introducing a kind of compounding precision error, although perhaps I am being overly cautious. For those desperate to never have to reload the base model file, one could store an unaltered set of tensors from the base model and then apply LoRAs to a copy of those tensors whenever the LoRAs being used, or their respective application strengths, change. Naturally, this would double the amount of space the base model takes up while the program is running.
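
A rough sketch of that idea (hypothetical names, not the current sdcpp API): snapshot each tensor the first time a LoRA is about to modify it, and restore the snapshot before applying a new LoRA configuration, so the weights never accumulate add/remove rounding error.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not existing sdcpp code.
struct TensorBackup {
    std::map<std::string, std::vector<std::uint8_t>> original;  // tensor name -> pristine bytes

    // Save a tensor's data once, before the first LoRA touches it.
    void snapshot(const std::string& name, const void* data, std::size_t nbytes) {
        if (original.count(name) == 0) {
            const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
            original[name].assign(p, p + nbytes);
        }
    }

    // Copy the pristine bytes back into the live tensors before applying a new LoRA set.
    void restore(std::map<std::string, std::pair<void*, std::size_t>>& live_tensors) const {
        for (const auto& kv : original) {
            auto it = live_tensors.find(kv.first);
            if (it != live_tensors.end() && it->second.second == kv.second.size()) {
                std::memcpy(it->second.first, kv.second.data(), kv.second.size());
            }
        }
    }
};

Backing up only the tensors that LoRAs actually touch would keep the extra memory somewhat below a full second copy of the model, though in the worst case it still approaches the doubling mentioned above.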

maj160 (Author) commented Sep 16, 2024

Thanks - I agree that even under the existing apply_loras mechanism there is potential for precision errors, but that's a much harder problem and not really the focus of this bug report.

This report is about buggy behaviour in the current implementation with a (hopefully!) simple fix. I'd rather it didn't get bogged down and lost in a discussion about how LoRAs are handled in general... that feels like a topic for another issue.

grauho (Contributor) commented Sep 16, 2024

The issue is that the two are related. This isn't a problem for the CLI program, because neither the model nor the LoRAs change from batch image to batch image, and even if there are precision errors in the LoRA tensor loading, they are reproducible.

However, for people who want to run sdcpp as a server, it's important to ensure that any solution does not lose precision over time. Otherwise, the same prompt with the same settings and the same seed could produce a different result purely due to precision decay, if many LoRAs have been loaded and unloaded in the intervening time.

The simplest solution when running in this kind of server mode would be to reload the model, either from a file or from a stored copy of the base model tensors, whenever any of the LoRA settings change; otherwise it is difficult to guarantee reliable and reproducible behavior.
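
As a sketch of that guard (hypothetical types and helpers, not existing sdcpp API): compare the requested LoRA configuration against the one currently baked into the weights, and only rebuild from clean base weights when they differ.

#include <map>
#include <string>

using LoraState = std::map<std::string, float>;  // lora name -> multiplier

// Hypothetical model wrapper; the two members stand in for "restore/reload base
// weights" and the existing LoRA application step.
struct ModelHandle {
    void reset_to_base_weights() { /* restore pristine tensors or reload from file */ }
    void apply_loras(const LoraState& /*state*/) { /* existing apply_loras path */ }
};

// Only touch the weights when the requested LoRA configuration differs from the
// applied one, and always start from clean base weights so results stay reproducible.
void prepare_for_request(ModelHandle& model, const LoraState& requested, LoraState& applied) {
    if (requested != applied) {
        model.reset_to_base_weights();
        model.apply_loras(requested);
        applied = requested;
    }
}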

taotaow commented Sep 18, 2024

Reloading the base model is too slow.

piallai (Contributor) commented Jan 19, 2025

Here's a quick example of the kind of snippet that would fix this (stable_diffusion.cpp:654)

for (auto& kv : curr_lora_state) {
    const std::string& lora_name = kv.first;
    float curr_multiplier        = kv.second;

    // If the lora is no longer in the prompt
    if (lora_state.find(lora_name) == lora_state.end()) {
        float multiplier_diff = -curr_multiplier;
        if (multiplier_diff != 0.f) {
            lora_state_diff[lora_name] = multiplier_diff;
        }
    }
}

There is probably a nicer way to do this, though

I support this idea. 'Cleaning' unused loras seems relevant to me. There may indeed be precision issues when multiplying the coefficients over and over, but that is part of the deal with loras as prompt input anyway. For reproducibility, the loras would ideally be parametrized together with the model, but since the strategy here is to avoid reloading the model every time the lora parameters change (for speed), this is the price to pay.

stduhpf linked a pull request Feb 7, 2025 that will close this issue