Support SDXL and its distributed inference #1514
@Zars19 thanks for the contribution to TensorRT-LLM! @nv-guomingz can you help take care of this? :) Thanks |
Sure, I'll collaborate with @Zars19 on enabling SDXL with TRT-LLM. |
Hi @Zars19, could you please resolve the code conflicts first? |
I have resolved the conflict :) @nv-guomingz |
Hi @Zars19 thanks for your patience. |
@nv-guomingz I completed the git rebase |
Any updates on the code review? |
After rebasing the code, I haven't received feedback for a while now |
Hi @Zars19, thanks for your patience. We recently resumed reviewing this with the latest TRT-LLM and noticed some issues so far:
A few other questions:
|
@hchings Thank you for your feedback! I've rebased onto the latest official code, debugged, and fixed some issues.
Responses to other issues:
|
Hi @Zars19 , thanks for the fixes!
|
@hchings Thanks for your reply.
import torch
from diffusers import StableDiffusionXLPipeline

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipeline.to("cuda")

seed = 1234
size = 2048
# prompt = "flowers, rabbit"
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(
    prompt=prompt,
    generator=torch.Generator(device="cuda").manual_seed(seed),
    height=size,
    width=size,
).images[0]
image.save("output.png")

Picture generated by the original SDXL:
|
Hi @Zars19, FYI that we're doing some final wrap-ups of this and will merge it soon. Thanks! |
Hi @Zars19, we've merged this internally and it will show up in the upcoming public release under |
The idea of patch parallelism comes from the CVPR 2024 paper DistriFusion. To simplify the implementation, all communication in the example is synchronous.
Patch parallelism can help SDXL achieve better performance, especially at very high resolutions.
A100, 50 steps, 2048x2048, SDXL
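For readers unfamiliar with patch parallelism using synchronous communication, here is a minimal single-process sketch of the idea. It is not the code from this PR and does not use TensorRT-LLM or NCCL. The "ranks" are simulated in a loop, a 3x3 mean filter stands in for a UNet layer that needs neighbouring pixels, and every function name (`split_rows`, `all_gather`, `patch_parallel_layer`, and so on) is hypothetical. Each rank owns a horizontal slab of the feature map, a blocking gather rebuilds the full map before every layer, and each rank then computes only its own rows.

```python
from typing import List

Grid = List[List[float]]


def mean_filter_rows(full: Grid, row_start: int, row_end: int) -> Grid:
    """3x3 mean filter with zero padding, computed only for rows [row_start, row_end)."""
    height, width = len(full), len(full[0])
    out = []
    for r in range(row_start, row_end):
        row = []
        for c in range(width):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        total += full[rr][cc]
            row.append(total / 9.0)
        out.append(row)
    return out


def split_rows(height: int, world_size: int) -> List[range]:
    """Assign one contiguous row range (patch) to each simulated rank."""
    if height % world_size != 0:
        raise ValueError("this sketch assumes height is divisible by world_size")
    step = height // world_size
    return [range(i * step, (i + 1) * step) for i in range(world_size)]


def all_gather(patches: List[Grid]) -> Grid:
    """Stand-in for a blocking all-gather: concatenate all patches along rows."""
    return [row for patch in patches for row in patch]


def patch_parallel_layer(patches: List[Grid]) -> List[Grid]:
    """One layer: gather the full map synchronously, then each rank computes its rows."""
    full = all_gather(patches)  # in a real setup every rank would block here
    outputs, start = [], 0
    for patch in patches:  # loop body = work done by one rank
        end = start + len(patch)
        outputs.append(mean_filter_rows(full, start, end))
        start = end
    return outputs


def run_patch_parallel(image: Grid, world_size: int, num_layers: int) -> Grid:
    """Split the image into row patches, apply the layers, and reassemble the result."""
    patches = [[image[r] for r in rows] for rows in split_rows(len(image), world_size)]
    for _ in range(num_layers):
        patches = patch_parallel_layer(patches)
    return all_gather(patches)


def run_reference(image: Grid, num_layers: int) -> Grid:
    """Single-device baseline: apply the same layers to the whole image."""
    for _ in range(num_layers):
        image = mean_filter_rows(image, 0, len(image))
    return image


if __name__ == "__main__":
    img = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    same = run_patch_parallel(img, world_size=4, num_layers=3) == run_reference(img, 3)
    print("patch-parallel output matches single-device output:", same)
```

Because the gather happens before every layer, each rank always sees up-to-date activations, so the patch-parallel result is identical to the single-device result. The cost is that every rank waits on communication at each layer, which is the trade-off accepted by keeping the communication synchronous.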