[Bug]: SDXL 1.0 base model extremely long loading time #12086

Closed
1 task done
andypotato opened this issue Jul 27, 2023 · 49 comments
Labels
bug-report Report of a bug, yet to be confirmed sdxl Related to SDXL

Comments

@andypotato

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

Loading the SDXL 1.0 base model takes an extremely long time. From my log:

Loading weights [31e35c80fc] from D:\SD\stable-diffusion-webui\models\Stable-diffusion\SDXL\sd_xl_base_1.0.safetensors]
Creating model from config: D:\SD\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
Applying attention optimization: xformers... done.
Model loaded in 115.6s (create model: 0.4s, apply weights to model: 103.6s, apply half(): 7.3s, move model to device: 3.0s, load textual inversion embeddings: 0.3s, calculate empty prompt: 0.9s).

So a total of almost 2 minutes, with most of the time spent on "apply weights to model".

What exactly happens at this step and is there a way to optimize it?
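
To see which stage dominates a load, the "Model loaded in" line can be split into its per-stage timings. The following is a minimal sketch, assuming the log line keeps the exact layout quoted above; the function name `parse_load_timings` is made up for illustration and is not part of the webui.

```python
import re


def parse_load_timings(line: str) -> tuple[float, dict[str, float]]:
    """Split a 'Model loaded in ...' log line into total and per-stage seconds."""
    # The greedy group runs to the last ')' so that 'apply half()' stays intact.
    match = re.search(r"Model loaded in ([\d.]+)s \((.*)\)", line)
    if match is None:
        raise ValueError("not a 'Model loaded in' line")
    total = float(match.group(1))
    stages: dict[str, float] = {}
    for part in match.group(2).split(", "):
        # Each part looks like 'apply weights to model: 103.6s'.
        name, _, seconds = part.rpartition(": ")
        stages[name] = float(seconds.rstrip("s"))
    return total, stages


# The line from the log above.
LOG_LINE = (
    "Model loaded in 115.6s (create model: 0.4s, apply weights to model: 103.6s, "
    "apply half(): 7.3s, move model to device: 3.0s, "
    "load textual inversion embeddings: 0.3s, calculate empty prompt: 0.9s)."
)

total, stages = parse_load_timings(LOG_LINE)
slowest = max(stages, key=stages.get)
print(f"{slowest}: {stages[slowest]}s of {total}s")
```

Running the same function over a log from a fast load makes it easy to compare which stage changed.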

Steps to reproduce the problem

Select the SDXL model from checkpoints

What should have happened?

The model should load in around 10-20 seconds. The SD 1.5 model loads in about 8 seconds for me

Version or Commit where the problem happens

1.5.0

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

Cross attention optimization

xformers

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

--xformers

List of extensions

None

Console logs

Loading weights [31e35c80fc] from D:\SD\stable-diffusion-webui\models\Stable-diffusion\SDXL\sd_xl_base_1.0.safetensors]
Creating model from config: D:\SD\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
Applying attention optimization: xformers... done.
Model loaded in 115.6s (create model: 0.4s, apply weights to model: 103.6s, apply half(): 7.3s, move model to device: 3.0s, load textual inversion embeddings: 0.3s, calculate empty prompt: 0.9s).

Additional information

No response

@andypotato andypotato added the bug-report Report of a bug, yet to be confirmed label Jul 27, 2023
@ClashSAN
Collaborator

are you using a pagefile or an HDD?

@andypotato
Author

I'm loading it from an SSD and no pagefile is used. My configuration:

  • i5 12500
  • RTX 3060/12GB
  • Kingston SSD
  • 32 GB DDR4 / 3200

@ClashSAN
Collaborator

I go from 6 GB to 20 GB of 24 GB in use. I load it with this VAE: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors - try that VAE.
I get 11.4 s on "apply weights to model", 22 s in total.

@andypotato
Author

This is interesting:

I previously had my SDXL models (base + refiner) stored inside a subdirectory named "SDXL" under /models/Stable-Diffusion. Now I moved them back to the parent directory and also put the VAE there, named sd_xl_base_1.0.vae.safetensors.

The loading time is now perfectly normal at around 15 seconds.

No idea what kind of funny bug this is, but this workaround worked for me.

@FjellsMemory

FjellsMemory commented Jul 28, 2023

I am having the exact same issue, though my models were never stored in a subdirectory. They have been stored in the proper models folder since I downloaded them. The VAE has not helped the issue in my case. My load time is just about two minutes, both for the base and for the refiner. They also almost completely freeze my computer during load, at various stretches of up to 30 seconds. Not completely frozen, but almost. All tips, tricks and questions welcome. After the model loads, it runs perfectly fine and image generation time is on par with the generation times I see on YouTube. But the loading is absolute torture. No other model acts this way for me. I also have a 3060 12 GB and am loading from an SSD.

@Alexandre-Fernandez

Loading weights [31e35c80fc] from /.../stable-diffusion-webui/models/Stable-diffusion/sd_xl_base_1.0.safetensors
Creating model from config: /.../stable-diffusion-webui/repositories/generative-models/configs/inference/sd_xl_base.yaml
Applying attention optimization: Doggettx... done.
Model loaded in 115.6s (create model: 0.6s, apply weights to model: 110.9s, apply half(): 1.7s, move model to device: 1.2s, load textual inversion embeddings: 0.2s, calculate empty prompt: 0.9s).

The "Creating model from config" step takes a lot of time and RAM for me. I have a 3090 and 32 GB of RAM, with the models on an HDD, on Ubuntu.
Sometimes it takes so much RAM (and time) that I'm forced to interrupt the process or it freezes my computer.

@jordenyt
Contributor

Try disabling checkpoint caching; somehow it seems to fix the loading issue:
Settings => Stable Diffusion => "Checkpoints to cache in RAM" => 0
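
The current value of this setting can also be checked without opening the UI. This is a rough sketch, assuming the webui stores its settings in a `config.json` file in the install folder and that the checkpoint cache setting is saved under the key `sd_checkpoint_cache`; neither detail is stated in this thread, so verify both against your own install. The helper name is hypothetical.

```python
import json
from pathlib import Path


def cached_checkpoint_count(config_path: str) -> int:
    """Return how many checkpoints the settings file asks to keep in RAM."""
    settings = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # ASSUMPTION: "sd_checkpoint_cache" is the key behind the checkpoint cache
    # setting; a missing key is treated as 0 (no caching) in this sketch.
    return int(settings.get("sd_checkpoint_cache", 0))


# Example call (the path is a placeholder, adjust it to your install):
# print(cached_checkpoint_count(r"D:\SD\stable-diffusion-webui\config.json"))
```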

@FjellsMemory

Thanks for suggesting; in my case, anyhow, this has apparently always been set to 0, and the issue has presented itself as described.

@andypotato
Author

I can confirm that changing this setting to 0 does not resolve the issue

@akx akx added the sdxl Related to SDXL label Jul 31, 2023
@thatjimupnorth

I'm also using a 3060 with 12 GB of VRAM and the loading times are unbearable. Literally taking 10 to 20 mins. As I write this I'm trying the above-mentioned trick of including the VAE file, but I don't hold out any hope. This is a fresh clean install of Automatic1111 after I attempted to add the AfterDetailer extension earlier today and it crashed Windows, then refused to restart.

As others have stated, once it does actually load it renders as fast as 1.5 but the load time is torture.
Screenshot 2023-08-07 at 22 50 24

@akashAD98

I'm also getting exactly the same issue: I'm not able to load the SDXL 1.0 weights, it's taking too much time.

@FjellsMemory

Well, boys, seeing as almost each person in this thread is a 3060 12gb RAM user... we may have found the element upon which this problem hinges. I'll be interested to see if the community trained XL models present us with the same problem. Maybe someone can try one out and report back. Some are already available on Civitai.

@catboxanon
Collaborator

catboxanon commented Aug 8, 2023

FYI: #11958 may have resolved this, or at least partially. Only on the dev branch currently.

@andypotato
Author

I tried this branch and didn't notice any big difference in loading speed

@thatjimupnorth

> Well, boys, seeing as almost each person in this thread is a 3060 12gb RAM user... we may have found the element upon which this problem hinges. I'll be interested to see if the community trained XL models present us with the same problem. Maybe someone can try one out and report back. Some are already available on Civitai.

There's no logic to it. I've been pulling my hair out. It was working absolutely fine - until it wasn't. I've literally emptied the pip cache, deleted the entire installation, and started from scratch TWICE. The only way I was able to get it to launch was by putting a 1.5 checkpoint in the models folder, but as soon as I then tried to load the SDXL base model, I got the "Creating model from config: " message for what felt like a lifetime and then the PC restarted itself. 24 hours ago it was cranking out perfect images with dreamshaperXL10_alpha2Xl10.safetensors from Civitai and now it's just a really expensive space heater. I don't get it.

@FjellsMemory

FjellsMemory commented Aug 8, 2023

fwiw - for me, dreamshaper takes just as long and causes the same load issues as the base model. catastrophic.

@thatjimupnorth

If we can get it working in Comfy that would rule out it being a 3060 problem, right? Has anyone tried that?

@FjellsMemory

not me, but i bet it works in Comfy even on a 3060 12gb. i just don't have the time to learn a whole new UI atm. maybe someone can chime back in on this.

@andypotato
Author

This issue seems exclusive to A1111 - I had no issue at all using SDXL in Comfy

@akashAD98

akashAD98 commented Aug 8, 2023

> If we can get it working in Comfy that would rule out it being a 3060 problem, right? Has anyone tried that?

I have a GeForce RTX 3090 24 GB, still the same issue.

@thatjimupnorth

OK, cool. So I'll install Comfy and report back if it works OK with SDXL. At that point I think we're all happy to say it needs to be looked at by the Automatic1111 developer. I've no idea what the process is for that, if anyone here can advise?

@thatjimupnorth

OK. So I sent that last message 25 mins ago. I've never used Comfy before. It's now installed, and cranking out perfect images with the SDXL base model.

THIS IS AN AUTOMATIC1111 PROBLEM.

Screenshot 2023-08-08 at 10 45 33

@thatjimupnorth

It's also working perfectly with community trained models.

Screenshot 2023-08-08 at 11 01 53

@FjellsMemory

FjellsMemory commented Aug 8, 2023

only problem is, A1111 has some functions that Comfy doesn't yet have, and... the learning curve.

@thatjimupnorth

Olivio has a video on his YouTube channel that takes you through installing it with an Automatic1111 style prompting interface. https://www.youtube.com/watch?v=330z7P_m7-c

@sanguivore-easyco

sanguivore-easyco commented Aug 8, 2023

> Well, boys, seeing as almost each person in this thread is a 3060 12gb RAM user... we may have found the element upon which this problem hinges. I'll be interested to see if the community trained XL models present us with the same problem. Maybe someone can try one out and report back. Some are already available on Civitai.

This is also happening on an AWS g5.4xlarge (which has an Nvidia A10G - 24 gb gpu), so...

@catboxanon
Collaborator

I don't want to derail this issue but keep in mind this is a hobby project, whereas the ComfyUI developer is a paid employee of StabilityAI.

@thatjimupnorth

> I don't want to derail this issue but keep in mind this is a hobby project, whereas the ComfyUI developer is a paid employee of StabilityAI.

It's not a criticism at all. The developer can't test his code on every possible platform with every possible graphics card. We're here to help make it better. I don't like ComfyUI. I can see why people who are more used to a Node based interface like Blender would be into it, but I'm a drag and drop point and click kind of guy. But every "how to do x" tutorial on YouTube starts with installing Automatic1111. If someone wanting to get into it for the first time runs into this problem it could be enough to put them off or cause the less experienced to try something that could actually damage their equipment. So it's important we flag things like this as and when they pop up. But it's totally understandable if it takes a while to fix. I personally wouldn't know where to even start, so I have huge respect for coders and developers who do.

@thatjimupnorth

So the feedback in a few different Discord groups is to downgrade back to the previous version of A1111 which was working with SDXL. I have no idea how to do that.

@andypotato
Author

There is no A1111 version prior to 1.5.0 that supports SDXL, at least not directly.

@thatjimupnorth

thatjimupnorth commented Aug 9, 2023

That's what I was led to believe too, but the weird thing about this particular issue is that SDXL was working perfectly fine when the A1111 update which supported it was first dropped. Whereas now it doesn't start up at all even on a completely fresh install, and on the odd occasion when you do wait 20 to 40 mins for it to load, it reloads everything all over again when you Apply even the most minor of UI changes, like saving defaults or putting the Clip Skip slider at the top of the UI, for example.

Someone else on Discord was saying it could be the NVIDIA driver, but that wouldn't explain SDXL working perfectly in ComfyUI.

The mystery deepens.

@andypotato
Author

andypotato commented Aug 10, 2023

I found that (for me) the issue will only occur with the .safetensors version of the SDXL models. If I convert the models to .ckpt format, the models will load at normal speed.

Here is the console output of me switching back and forth between the base and refiner models in A1111 1.5.1 (VAE selection set to "Auto"):

Loading weights [f5df61fbb6] from D:\SD\stable-diffusion-webui\models\Stable-diffusion\sd_xl_refiner_1.0.ckpt
Creating model from config: D:\SD\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_refiner.yaml
Applying attention optimization: xformers... done.
Model loaded in 5.2s (create model: 0.2s, apply weights to model: 1.1s, apply half(): 1.7s, move model to device: 1.7s, load textual inversion embeddings: 0.1s, calculate empty prompt: 0.4s).

Loading weights [aef4134af1] from D:\SD\stable-diffusion-webui\models\Stable-diffusion\sd_xl_base_1.0.ckpt
Creating model from config: D:\SD\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
Applying attention optimization: xformers... done.
Model loaded in 6.8s (create model: 0.4s, apply weights to model: 1.8s, apply half(): 2.1s, move model to device: 1.9s, calculate empty prompt: 0.4s).

Loading weights [f5df61fbb6] from D:\SD\stable-diffusion-webui\models\Stable-diffusion\sd_xl_refiner_1.0.ckpt
Creating model from config: D:\SD\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_refiner.yaml
Applying attention optimization: xformers... done.
Model loaded in 5.5s (create model: 0.3s, apply weights to model: 1.3s, apply half(): 1.8s, move model to device: 1.8s, calculate empty prompt: 0.3s).

The times above seem to be measured from the moment the model file was fully loaded from hard disk. So depending on your HDD / SSD speed, the actual time will be longer (around 30-40 seconds for HDD, 5-10 seconds for SSD) - Still a HUGE improvement over the 2-3 minutes it took for the .safetensors version.
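
To separate raw disk read time from the stages A1111 reports, the model file can be read once from start to end and timed. A minimal sketch using only the standard library follows; the function name and the chunk size are arbitrary choices made for illustration.

```python
import time


def time_sequential_read(path: str, chunk_size: int = 16 * 1024 * 1024) -> tuple[int, float]:
    """Read the whole file once; return (bytes read, elapsed seconds)."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so the file is never held in RAM as a whole.
        while chunk := handle.read(chunk_size):
            total_bytes += len(chunk)
    return total_bytes, time.perf_counter() - start


# Example call (the path is a placeholder):
# size, seconds = time_sequential_read(r"D:\SD\models\sd_xl_base_1.0.ckpt")
# print(f"{size / 1e6 / seconds:.0f} MB/s")
```

A second run on the same file is often much faster because the operating system may serve it from its file cache, so the first run after a reboot is the more representative number.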

RAM usage peaked at about 22 GB during each switch.

This is the tool I used to convert the models:
https://github.com/diStyApps/Safe-and-Stable-Ckpt2Safetensors-Conversion-Tool-GUI

Can anyone reproduce this?
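
For anyone comparing the two formats, a .safetensors file can be inspected without loading any weights: the file starts with an 8-byte little-endian header length, followed by a JSON header that lists each tensor's dtype, shape and byte offsets. The sketch below reads only that header with the standard library; the function names are made up for illustration and this is not how A1111 itself loads the file.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as handle:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", handle.read(8))
        return json.loads(handle.read(header_len))


def summarize(header: dict) -> tuple[int, int]:
    """Return (number of tensors, total tensor bytes) described by the header."""
    # "__metadata__" is an optional free-form entry, not a tensor.
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    total = sum(v["data_offsets"][1] - v["data_offsets"][0] for v in tensors.values())
    return len(tensors), total


# Example call (the path is a placeholder):
# count, nbytes = summarize(read_safetensors_header("sd_xl_base_1.0.safetensors"))
# print(count, "tensors,", round(nbytes / 1e9, 2), "GB of tensor data")
```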

@thatjimupnorth

It could be that there's more than one issue, because I can only dream of it "only" taking 2 to 3 minutes.

When you say "RAM usage peaked at about 22 GB during each switch" do you mean VRAM or system RAM?

@thatjimupnorth

... for anyone who wants a quick fix, by the way, installing A1111 under Pinokio does allow you to switch between SDXL base model and Community Trained models about as quickly as it does under SD 1.5, so there's definitely something not right with the latest release of A1111 you would get as normal just by running git pull

@andypotato
Author

andypotato commented Aug 10, 2023

> When you say "RAM usage peaked at about 22 GB during each switch" do you mean VRAM or system RAM?

I'm obviously talking about system RAM. Try the method I suggested and report back if that fixes the issue for you.

@FjellsMemory

FjellsMemory commented Aug 11, 2023

converting .safetensors to .ckpt did NOT alter the issue in any way for me. identical results (in my case, load time between 2 and 3 minutes while computer intermittently freezes).

for me, Pinokio install changed my SDXL model loading times from 2-3 minutes to about 40 seconds. an improvement, but not a solution.

@thatjimupnorth

Installing under Pinokio isn't the fix I thought it was either. It's now doing the same thing as reported above after briefly working fine. Bottom line is, for some users and for an unknown reason, A1111 is NOT working with SDXL.

@pedroquintanilla

I thought it was because of my card, an NVIDIA 3070 with 8 GB of VRAM, which takes forever to load XL models, but in Comfy they load fast. I hope you can provide a solution, because I do not like using Comfy.

@metter

metter commented Aug 15, 2023

using --medvram instead of --xformers solved this for me

@FjellsMemory

thanks for the suggestion. i'm going to try that today and Fooocus, featured on Sebastian Kamph today. https://www.youtube.com/watch?v=8krykSwOz3E

@FjellsMemory

> using --medvram instead of --xformers solved this for me

sadly, this changed nothing for me. still frozen, buggy, long loading.

@weajus

weajus commented Aug 24, 2023

Seems like it got fixed with --medvram-sdxl. Just check out the dev branch (tested on a 3070).

@andypotato
Author

This issue is actually a combination of several issues

  1. The SDXL base model is around 6 GB in size. Loading this model from a regular HDD will take a good amount of time. So if you're not using an SSD or M.2 drive, expect an initial model loading time of about 40-50 seconds. Nothing you can do about this except for buying a faster drive.

  2. In addition to slow HDD loading times, users with less than 32 GB system RAM (not GPU VRAM) will experience slow loading speeds as the RAM usage for loading SDXL will exceed 16 GB. This will cause your system to start swapping, especially during the loading stage that applies the xformers optimization. This is an issue that will probably be fixed in an upcoming A1111 version. Depending on other factors, even 32 GB may not be enough at this moment.

  3. The above problem gets worse if you have enabled RAM caching of checkpoints in your A1111 settings. Setting the amount of cached checkpoints in RAM to zero will prevent this.

  4. Changing the model format from .safetensors to .ckpt using the suggested tool will drastically reduce the amount of system RAM (not GPU VRAM) needed for loading the model. This will lessen the impact of the issue described in 2)

  5. Switching off xformers as suggested by @metter will not solve the problem, but will actually have negative effects for almost all users. Xformers is intended to reduce memory usage during image generation and has little impact on the initial loading time. However, it will make the issue described in 2) worse if you run into low system RAM.

  6. Any low / med vram command line parameters are not related to this issue. @weajus reported that --medvram-sdxl resolves the issue, however this is not due to the usage of the parameter, but due to the optimized way A1111 now manages system RAM, therefore not running into the issue 2) any longer.
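
The arithmetic behind points 1 to 3 can be sketched roughly as follows. This is a back-of-the-envelope model, not a measurement: the 6 GB file size, the "more than 16 GB" load peak and the 22 GB peak come from this thread, while the 150 MB/s HDD throughput and the 4 GB already in use by other applications are assumptions picked for illustration. Both function names are hypothetical.

```python
def estimated_read_seconds(file_gb: float, throughput_mb_per_s: float) -> float:
    """Time to read file_gb gigabytes at a sustained throughput in MB/s."""
    return file_gb * 1024 / throughput_mb_per_s


def likely_to_swap(
    total_ram_gb: float,
    in_use_gb: float,
    peak_load_gb: float,
    cached_checkpoints: int = 0,
    checkpoint_gb: float = 6.0,
) -> bool:
    """True if the rough peak demand during a model load exceeds physical RAM."""
    # Each cached checkpoint is assumed to hold roughly one model's worth of RAM.
    demand = in_use_gb + peak_load_gb + cached_checkpoints * checkpoint_gb
    return demand > total_ram_gb


# Point 1: a 6 GB file on a drive assumed to sustain 150 MB/s.
print(round(estimated_read_seconds(6, 150), 1), "seconds")

# Point 2: 16 GB machine, 4 GB in use (assumed), 16 GB load peak.
print(likely_to_swap(16, 4, 16))

# Point 3: 32 GB machine, 22 GB peak, with and without two cached checkpoints.
print(likely_to_swap(32, 4, 22))
print(likely_to_swap(32, 4, 22, cached_checkpoints=2))
```

With these inputs the 32 GB machine only crosses into swapping once checkpoint caching is enabled, which matches the pattern described in points 2 and 3.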

Happy generating everybody!

@FjellsMemory

Thanks for the tips. Will try the suggested "fixes" and report back. Just as a matter of principle, however, I find the suggestion that the long loading times are the result of having only 16 gigabytes of system RAM quite interesting, as I would expect more users to face this issue if that truly were one of the culprits.

@andypotato
Author

It is a combination of slow disk access and low system RAM.

How you run into the low system RAM issue is again a combination of many factors that entirely depend on your setup and configuration. You may even run into this issue with 32 GB of RAM - I sure do when I enable checkpoint caching and have other apps open while running SD. Using ckpt models also helped me reduce RAM usage.

The upcoming version of A1111 uses a more optimized method of loading models thus reducing the chance of you running into the low RAM situation. However if you're running SD on a potato PC with other Apps open at the same time, you'll most likely see the same issue again.

Just for the record: None of that is related to the VRAM of your GPU. This is only relevant once you hit the red "Generate" button.

@FjellsMemory

FjellsMemory commented Aug 24, 2023

OK, again, thank you for your input, and thank you for at least some suggestions of where to begin. As I have written in this thread, I converted the .safetensors files to .ckpt and still experienced long loading times. I am running on an SSD with 16 gigabytes of system memory. The PC in question is neither a potato nor a high-end machine but something in between. I'm also, like most in this thread, running on a 3060 with 12 gigabytes of VRAM, and my issue only extends to the loading and not to the image generation, as reported.

@arafatx

arafatx commented Oct 5, 2023

> This issue seems exclusive to A1111 - I had no issue at all using SDXL in Comfy

I've noticed that this problem is specific to A1111 too and I thought it was my GPU. I encountered no issues when using SDXL in Comfy. When I first learned about Stable Diffusion, I wasn't aware of the many UI options available beyond Automatic1111. Perhaps it's time for us to explore other alternatives. EasyDiffusion, for instance, seems to handle SDXL more efficiently. The only reason I stick with A1111 is because of some plugins.

@ChrisTop700

ChrisTop700 commented Dec 26, 2023

So interesting, experiencing same thing but only after first generation on a fresh load of the webui. Looks like some kinda VAE issue, which might be why it worked for some?

image

Same thing, other tools don't seem to cause this issue. Specifying the VAE had the same results. Copying the VAE just really made it mad, as I expected, as it's not a valid safetensors file.

--disable-nan-check argument makes it black. --no-half-vae skips the first fast image production.

@rungvang

rungvang commented Jan 7, 2024

I have the same problem. A1111 1.7.0, 16 GB RAM, 16 GB VRAM: SDXL model loading time is 80-90 sec from SSD, versus 15-20 sec for an SD 1.5 model. But if you have already used the SDXL model and then close and reopen the WebUI, the loading time is fast: 5-10 sec. It's not a problem with the SSD, since with Comfy it loads very fast.
