Low-Rank Adaptation (LoRA) has become a widely adopted technique in text-to-image models, enabling precise rendering of distinct elements such as characters and styles. However, current approaches struggle when composing multiple LoRAs for multi-concept image generation: as the number of LoRAs increases, the quality of the generated images degrades.
We devise a frequency-domain sequencing strategy to determine the optimal order in which LoRAs should be integrated during inference. This strategy offers a methodical and generalizable alternative to the naive integration commonly found in existing LoRA fusion techniques. To fully leverage this ordering in multi-LoRA composition tasks, we introduce a novel, training-free framework, Cached Multi-LoRA (CMLoRA), designed to integrate multiple LoRAs efficiently while maintaining cohesive image generation. With its flexible backbone for multi-LoRA fusion and a non-uniform caching strategy tailored to individual LoRAs, CMLoRA has the potential to reduce semantic conflicts in LoRA composition and improve computational efficiency.
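To make the ordering idea concrete, here is a minimal PyTorch sketch (illustrative, not the exact CMLoRA implementation) that scores each LoRA by the high-frequency energy of the latent it produces and sorts LoRAs by that score. The helper names, the cutoff radius, and the assumption that a per-LoRA latent contribution is available are all placeholders.

```python
# Minimal sketch of a frequency-domain ordering heuristic (illustrative only).
# Each LoRA is scored by the high-frequency energy of the latent it produces,
# and LoRAs are ordered by decreasing score.
import torch


def high_freq_energy(latent: torch.Tensor, cutoff: float = 0.25) -> float:
    """Sum of FFT amplitudes outside a centered low-frequency disc.

    `latent` is a (C, H, W) feature map, e.g. a single-step denoising latent
    produced with only one LoRA active. `cutoff` (a placeholder value) sets the
    disc radius as a fraction of the spatial size.
    """
    spectrum = torch.fft.fftshift(torch.fft.fft2(latent.float()), dim=(-2, -1)).abs()
    _, h, w = spectrum.shape
    yy, xx = torch.meshgrid(
        torch.arange(h, device=latent.device),
        torch.arange(w, device=latent.device),
        indexing="ij",
    )
    radius = ((yy - h / 2) ** 2 + (xx - w / 2) ** 2).sqrt()
    high_pass = radius > cutoff * min(h, w) / 2  # True only for high frequencies
    return spectrum[:, high_pass].sum().item()


def order_loras(latent_per_lora: dict[str, torch.Tensor]) -> list[str]:
    """Order LoRA names from most to least high-frequency content."""
    return sorted(latent_per_lora, key=lambda n: high_freq_energy(latent_per_lora[n]), reverse=True)
```

LoRAs that contribute more high-frequency content would then be treated as dominant and fused earlier, which is the intuition behind the sequencing strategy described above.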
The figures below illustrate differences between the traditional LoRA composition approaches and our proposed CMLoRA method:
Example 1. Images generated with varying numbers $N$ of LoRA candidates ($L1$ Character, $L2$ Clothing, $L3$ Style, $L4$ Background, and $L5$ Object) across our proposed framework and baseline methods.
Example 2. Images generated with varying numbers $N$ of LoRA candidates ($M1$ Character, $M2$ Clothing, $M3$ Style, $M4$ Background, and $M5$ Object) across our proposed framework and baseline methods.
Example 3. Images generated with varying numbers $N$ of LoRA candidates ($R1$ Character, $R2$ Clothing, $R3$ Style, $R4$ Background, and $R5$ Object) across our proposed framework and baseline methods.
To begin, set up your environment with the necessary packages:
conda create --name cmlora python=3.10
conda activate cmlora
pip install -r requirements.txt # Inference
pip install -r requirements1.txt # Evaluation
The ComposLoRA testbed, proposed by Multi-LoRA-Composition, collects 22 pre-trained LoRAs spanning characters, clothing, styles, backgrounds, and objects. Download ComposLoRA.zip from this link, place it in the models folder, and unzip it.
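As a quick sanity check that the archive extracted correctly, the snippet below loads two of the LoRAs with the standard diffusers PEFT integration and fuses them with fixed weights. This is not the CMLoRA pipeline itself, and the base checkpoint and LoRA file names are placeholders; substitute the actual files found under models/ComposLoRA.

```python
# Sanity check (illustrative only, not the CMLoRA pipeline): load two LoRAs
# from the extracted testbed with diffusers and fuse them with fixed weights.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder base checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Adjust weight_name to actual files inside models/ComposLoRA.
pipe.load_lora_weights("models/ComposLoRA", weight_name="character.safetensors", adapter_name="character")
pipe.load_lora_weights("models/ComposLoRA", weight_name="style.safetensors", adapter_name="style")
pipe.set_adapters(["character", "style"], adapter_weights=[0.8, 0.8])

image = pipe("a character portrait in a watercolor style", num_inference_steps=30).images[0]
image.save("composed.png")
```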
Refer to example.py for the full code, and adjust the following command to see results from different composition methods and LoRA caching strategies:
python example.py --method composite
We provide five multi-LoRA composition pipelines:
- A method that performs diffusion using prompt-only generation, without incorporating any LoRAs.
- A multi-LoRA composition method that uses an offline LoRA disparity method: LoRAs are partitioned according to the profiled frequency-domain sequencing strategy.
- A method that uses an online frequency partition strategy: LoRAs are dynamically partitioned based on the amplitude of their high-frequency components during the generative process (see the sketch after this list).
- A composition pipeline that leverages asynchrony in two components: the LoRA switch mechanism and the decaying dominant weight scale.
- An advanced pipeline for research purposes, which includes functions to plot the intensity of the high-frequency components and measure the similarity of latent feature maps during the denoising process.
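The sketch below illustrates, at a conceptual level, how an online frequency partition can be combined with non-uniform caching: LoRAs with the largest high-frequency amplitude ("dominant") are re-evaluated at every denoising step, while the remaining ("minor") LoRAs reuse a cached contribution for several steps. The function `predict_with_lora`, the cache interval, and the scheduler stand-in are placeholders; the actual logic lives in callbacks_sa.py.

```python
# Conceptual sketch of an online frequency partition with non-uniform caching
# (illustrative; see callbacks_sa.py for the real implementation).
import torch


def high_freq_amplitude(latent: torch.Tensor, cutoff: float = 0.25) -> float:
    """High-frequency FFT energy (duplicated from the earlier sketch for self-containment)."""
    spec = torch.fft.fftshift(torch.fft.fft2(latent.float()), dim=(-2, -1)).abs()
    h, w = spec.shape[-2:]
    yy, xx = torch.meshgrid(
        torch.arange(h, device=latent.device),
        torch.arange(w, device=latent.device),
        indexing="ij",
    )
    mask = ((yy - h / 2) ** 2 + (xx - w / 2) ** 2).sqrt() > cutoff * min(h, w) / 2
    return spec[..., mask].sum().item()


def denoise_with_cached_loras(latents, lora_names, predict_with_lora,
                              steps=30, num_dominant=2, cache_interval=3):
    """`predict_with_lora(latents, step, name)` returns that LoRA's contribution (placeholder API)."""
    cache, dominant = {}, set(lora_names)  # evaluate every LoRA on the first step
    for t in range(steps):
        refresh_all = (t % cache_interval == 0)
        for name in lora_names:
            # Dominant LoRAs are recomputed every step; minor LoRAs only when their cache is refreshed.
            if refresh_all or name in dominant or name not in cache:
                cache[name] = predict_with_lora(latents, t, name)
        # Re-partition based on the current high-frequency amplitudes.
        ranked = sorted(lora_names, key=lambda n: high_freq_amplitude(cache[n]), reverse=True)
        dominant = set(ranked[:num_dominant])
        contribution = sum(cache[n] for n in lora_names)
        latents = latents - 0.1 * contribution  # stand-in for the real scheduler update
    return latents
```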
- We provide the fft_analysis.ipynb notebook for analyzing the amplitude of the frequency spectrum of different images. This tool helps visualize and examine how the frequency components vary across generated images (a standalone sketch of this analysis follows below).
- We implement a revised LoRA Switch method with profiled LoRA partition in callbacks_sa.py, while the standard version is available in callbacks_origin.py.
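For readers without the notebook at hand, the following standalone snippet reproduces the flavor of that analysis (the notebook itself may differ): it plots the log-amplitude spectrum of a generated image and reports the share of energy in the high-frequency region. The input file name is a placeholder.

```python
# Standalone sketch of the kind of analysis fft_analysis.ipynb performs
# (the notebook may differ). "composed.png" is a placeholder file name.
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

img = np.asarray(Image.open("composed.png").convert("L"), dtype=np.float32)
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))

h, w = spectrum.shape
yy, xx = np.mgrid[0:h, 0:w]
radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
high = radius > 0.25 * min(h, w) / 2  # everything outside the low-frequency disc
print(f"high-frequency energy share: {spectrum[high].sum() / spectrum.sum():.3f}")

plt.imshow(np.log1p(spectrum), cmap="magma")
plt.title("log-amplitude spectrum")
plt.axis("off")
plt.savefig("fft_spectrum.png", bbox_inches="tight")
```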
If you find this work useful, please kindly cite our paper:
@article{zou2025cached,
title={Cached Multi-Lora Composition for Multi-Concept Image Generation},
author={Zou, Xiandong and Shen, Mingzhu and Bouganis, Christos-Savvas and Zhao, Yiren},
journal={arXiv preprint arXiv:2502.04923},
year={2025}
}
This project is licensed under the MIT License - see the LICENSE file for details.