[Forge 0.0.16] NeverOOM Checkbox #395
Replies: 4 comments 5 replies
-
In addition, we recently added a few command-line flags for tuning performance. The text below is taken from the README:
-
How can we change the tile size with this method (especially the VAE tile size)? It seems to use a very small tile size. In my tests with other GUIs, I found that I can enter a larger size without getting memory errors and get faster VAE decoding. In ComfyUI with tiled VAE, 512 and 768 run at around the same speed, but with 960 (which was also the highest resolution I could generate with SD1.5) generation speeds up significantly without OOM.
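To make the tile-size question concrete, here is a minimal sketch of how the number of VAE tiles changes with tile size. It is a generic overlapping-tile grid, not Forge's or ComfyUI's actual tiling code; the function names `tile_starts` and `tile_count` and the 64-pixel overlap are assumptions made for illustration.

```python
import math


def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets of tiles that cover [0, length) with the given overlap."""
    if tile >= length:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    count = math.ceil((length - tile) / stride) + 1
    # Clamp the last tile so it ends exactly at the image border.
    return [min(i * stride, length - tile) for i in range(count)]


def tile_count(width: int, height: int, tile: int, overlap: int = 64) -> int:
    """Number of tiles (decode calls) needed for a width x height image."""
    return len(tile_starts(width, tile, overlap)) * len(tile_starts(height, tile, overlap))


if __name__ == "__main__":
    # Compare the tile sizes mentioned in the comment for a hypothetical 2048x2048 image.
    for tile in (512, 768, 960):
        print(tile, tile_count(2048, 2048, tile))
```

Fewer, larger tiles mean fewer decode calls and less overlap to recompute, at the cost of more memory per call. That is one plausible reason a larger tile size can be faster, though the actual timing depends on the GPU and the implementation.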
-
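I found a bug in NeverOOM: it often throws an error in img2img mode, while txt2img mode works fine. The images I generate are fairly large, and VAE decoding in particular uses a lot of VRAM. This option saves a lot of VRAM and speeds up decoding, so I really need it.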
-
Does the use of reduced VRAM with NeverOOM affect the speed of generating images?
-
These are two QoL tricks for making large images.
If you use "Enabled for VAE (always tiled)", tiled VAE will always be used to encode/decode images.
If you use "Enabled for UNet (always maximize offload)", diffusion GPU memory will drop below 1.5GB for SDXL at 1024x1024 (and even lower for SD1.5). For example, this is me generating a 1024x1024 image using the SDXL base model with "Enabled for UNet" checked. You can see that GPU memory stays below 1.5GB even for SDXL at 1024px.
With this, you can use all your remaining memory to generate very large images.
By using "Enabled for UNet (always maximize offload)", I am able to generate images at 6553x6553 using 8GB VRAM with SDXL. Previously this would need multi-diffusion, sacrificing result quality a bit, but now you can generate at that resolution natively in one single pass. Below are screenshots of diffusion at 6553x6553 running natively with SDXL on 8GB VRAM.
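For a rough sense of scale, the sketch below compares the two resolutions mentioned in this post. It assumes the standard 8x spatial downscale and 4 latent channels of the SD1.5/SDXL VAE, and it uses floor division for sizes that are not a multiple of 8, which may differ from how the UI actually rounds them. The helper names are illustrative only.

```python
VAE_DOWNSCALE = 8      # spatial downscale factor of the SD1.5/SDXL VAE
LATENT_CHANNELS = 4    # latent channels used by SD1.5/SDXL


def latent_shape(width: int, height: int) -> tuple[int, int, int]:
    """Approximate latent tensor shape (channels, height, width) for an image size."""
    # Floor division is an assumption; a real UI may round to a multiple of 8 first.
    return (LATENT_CHANNELS, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE)


def pixel_ratio(width: int, height: int, base: int = 1024) -> float:
    """How many times more pixels an image has than a base x base image."""
    return (width * height) / (base * base)


if __name__ == "__main__":
    for size in (1024, 6553):
        print(size, latent_shape(size, size), round(pixel_ratio(size, size), 2))
```

A 6553x6553 image has roughly 41 times as many pixels as a 1024x1024 one, so the activations of a single diffusion pass are far larger. This is why freeing the memory otherwise held by model weights matters at that size.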