RTX 3090 insane low speed #11
Hi.
I'm using an RTX 3090 GPU with 24 GB of VRAM, and I think something is wrong.
Theoretically it should take around 3 minutes, but it doesn't.
Also posted on reddit.
Cheers!

Comments
Hi @davizca, please try setting `view_batch_size` to 16. It should work on a 3090 and will make inference faster.
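For anyone following along, here is a minimal sketch of that suggestion, assuming the repo's `pipeline_demofusion_sdxl` module and the SDXL base weights; the prompt is just a placeholder:

```python
# Minimal sketch, assuming the repo's DemoFusionSDXLPipeline;
# view_batch_size is the setting under discussion in this thread.
import torch
from pipeline_demofusion_sdxl import DemoFusionSDXLPipeline

pipe = DemoFusionSDXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

images = pipe(
    "a photo of an astronaut riding a horse",  # placeholder prompt
    height=2048,
    width=2048,
    view_batch_size=16,  # larger view batches keep a 24 GB card busy
)
```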
Hi @RuoyiDu, thanks for the answer. I set the view batch size to 16, and for a 2048x2048 image, phase 2 decoding is still taking 12+ minutes (and it's increasing slowly; I guess it's the same as before). 1024x1024 runs super fast, though. Settings and screenshot attached. Cheers.
Hi @davizca, this is very strange. Are you running on a laptop with an RTX 3090? The power of the GPU also affects inference time -- I'm using an RTX 3090 in a local server with 350 W of power. You can check the power with `nvidia-smi`.
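If it helps, here is one way to log power draw while a generation runs; this is just a convenience wrapper around `nvidia-smi`, so it assumes an NVIDIA driver is installed:

```python
# Poll GPU power draw and power limit every few seconds via nvidia-smi.
# Run this in a separate terminal while the pipeline is generating.
import subprocess
import time

for _ in range(10):
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw,power.limit", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    print(result.stdout.strip())  # e.g. "190.00 W, 350.00 W"
    time.sleep(5)
```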
Hi @RuoyiDu. nvidia-smi says 190 W on average, and board draw power reports the same. It peaks at 23.6 GB of VRAM when inferencing a 2048x2048 image. The part that takes forever is phase 2 decoding (the earlier phases are fast). I don't know if this is due to some dependency, but it would be awesome if other users with an RTX 3090 could test it. I never get a constant 350 W of board draw power with this pipeline.
Hi @davizca, on my server it takes about 80 s under full load. I'll try to optimise the speed of the decoding, but it looks like there's some other reason it's especially slow on your end. Let's see if anyone else in the community is experiencing similar issues.
Thanks @siraxe! But it's still much slower than on my machine... It seems the decoder is quite slow on your PC, which makes it ridiculously slow when using the tiled decoder. I'll try to figure out the reason -- but it may be a little hard for me since I can't reproduce this issue on my end. BTW, I like your generation! Hope you can enjoy it!
Was also seeing super slow times on my 4090. With a low view batch size of 4 and multi-decoding set to true, I was seeing hour-long generation times. Down to 6 minutes now that I've fixed those (a higher batch size with multi-decoding off, as in the sketch below)! Hope this information is helpful.
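A hypothetical before/after of the settings described in this comment, reusing the `pipe` object from the earlier sketch; the parameter names `view_batch_size` and `multi_decoder` are the ones used elsewhere in this thread, and the timings are just the ones reported above:

```python
# Hypothetical before/after settings based on this comment; the timings in
# the comments are the reported ones, not guarantees.
slow_settings = dict(view_batch_size=4, multi_decoder=True)    # ~1 hour reported
fast_settings = dict(view_batch_size=16, multi_decoder=False)  # ~6 minutes reported

images = pipe(
    "a photo of an astronaut riding a horse",  # placeholder prompt
    height=2048,
    width=2048,
    **fast_settings,  # multi_decoder=False decodes in one pass but needs more VRAM
)
```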
Hi. Thanks everyone for checking into this. I'm currently not at home, but I'll try the fix on Monday. The difference in inference times between @RuoyiDu and the others is weird... we'll see what's happening here ;)
Hi guys @davizca @siraxe @Yggdrasil-Engineering, I found a little mistake at line #607:
@RuoyiDu

```
Phase 1 Denoising: 100%|██████████| 50/50 [00:39<00:00, 1.27it/s]
Phase 1 Denoising: 100%|██████████| 50/50 [00:13<00:00, 3.58it/s]
Phase 2 Denoising: 100%|██████████| 50/50 [02:56<00:00, 3.42s/it]
### Phase 2 Decoding ###
```

With `multi_decoder = False` (same settings):