How to allocate memory from 2nd GPU? #156

Open · aeon3 opened this issue Sep 8, 2022 · 16 comments
Labels: enhancement (New feature or request)

Comments

@aeon3 commented Sep 8, 2022

Here is the error I ran into:

"RuntimeError: CUDA out of memory. Tried to allocate 18.00 GiB (GPU 0; 24.00 GiB total capacity; 20.51 GiB already allocated; 618.87 MiB free; 20.59 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"

I have a 2nd GPU which could be used to allocate that extra 18 GB; however, I need help figuring out how to show SD there is a 2nd GPU present.

Any thoughts?
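
For context, PyTorch itself can already enumerate a second GPU; the hard part is that the webui places the model on a single device. A minimal check like the sketch below shows both cards are visible, but by itself it does not make SD use the second one:

```python
# Illustrative check only: PyTorch can see both GPUs, but a tensor or model lives
# only on the device it is explicitly moved to. This does not change what the webui does.
import torch

print(torch.cuda.device_count())               # e.g. 2 if both cards are visible
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))

x = torch.empty(1024, 1024, device="cuda:1")   # allocates on the second GPU only
```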

@AUTOMATIC1111 (Owner)

Using memory from between two GPUs is not simple. I only have one so I can't research/develop this.

@aeon3 (Author) commented Sep 8, 2022

> Using memory from between two GPUs is not simple. I only have one so I can't research/develop this.

Oh hi.
Well, I have mine linked with NVLink; I thought that would make it a breeze to benefit from memory pooling.
I guess it is not that different from having 2 unlinked GPUs after all?

@dev-greene commented Sep 9, 2022

I would be interested in this as well. I don't think something like SLI is the answer, though. Even just distributing the batch or iterations across the available GPUs would help.

@aeon3 (Author) commented Sep 9, 2022

Found this guy talking about it here:
https://youtu.be/hBKcL8fNZ18?list=PLzSRtos7-PQRCskmdrgtMYIt_bKEbMPfD&t=481

Not sure if it's helpful or not, but he shows some code.

@mchaker commented Sep 23, 2022

This is the most intuitive and complete webui fork. It would be amazing if this could be implemented here:

NickLucche/stable-diffusion-nvidia-docker#8

The potential to double image output even with the same VRAM is awesome.

from #311

@mchaker commented Sep 23, 2022

For more than just 2 GPUs, NickLucche has code:

I imagine you're really busy with all the requests and bugs, but if you have 5 minutes, have a look at this file in NickLucche's project:

https://github.com/NickLucche/stable-diffusion-nvidia-docker/blob/master/parallel.py

He apparently wrote an external wrapper around the application that queries whether multiple GPUs are present and, if so, uses data parallelism.

@dfaker added the enhancement (New feature or request) and dreams labels Sep 27, 2022
@NickLucche

Hi! I could probably port this multi-GPU feature, but I would appreciate some pointers as to where in the code I should look for the actual model (I am using the vanilla one from Hugging Face).
The easiest mode would be implementing a ~data parallel approach, in which we have one model per GPU and the workload is distributed among them.
Given the number of features this repo provides, I think it could take some time to have them all supported in the parallel version.
Let me know your thoughts on this.
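
For reference, a minimal sketch of such a data-parallel wrapper, assuming the vanilla Hugging Face diffusers pipeline rather than this repo's model loading code (the checkpoint ID is just an example):

```python
# Minimal data-parallel sketch: one pipeline copy per GPU, prompts split between them.
# Assumes the plain diffusers pipeline, not the webui's own model handling.
from concurrent.futures import ThreadPoolExecutor

import torch
from diffusers import StableDiffusionPipeline

MODEL_ID = "runwayml/stable-diffusion-v1-5"  # example checkpoint, swap for your own

def load_replicas():
    """Load one pipeline replica per visible CUDA device."""
    replicas = []
    for i in range(torch.cuda.device_count()):
        pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
        replicas.append(pipe.to(f"cuda:{i}"))
    return replicas

def generate(replicas, prompts):
    """Split prompts across replicas and run them concurrently, one thread per GPU."""
    buckets = [prompts[i::len(replicas)] for i in range(len(replicas))]
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        results = pool.map(lambda pair: pair[0](pair[1]).images, zip(replicas, buckets))
    return [img for batch in results for img in batch]
```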

@swcrazyfan

> Hi! I could probably port this multi-GPU feature, but I would appreciate some pointers as to where in the code I should look for the actual model (I am using the vanilla one from Hugging Face).
>
> The easiest mode would be implementing a ~data parallel approach, in which we have one model per GPU and the workload is distributed among them.
>
> Given the number of features this repo provides, I think it could take some time to have them all supported in the parallel version.
>
> Let me know your thoughts on this.

Is this still in the works? I understand it could take a while to make everything support multiple GPUs, but if I could use both of my GPUs to generate images, that would be good enough. For example, if I select a batch of 2, each GPU would do one; if I did 8, each would do 4.

Is that complicated?

@Extraltodeus (Contributor)

@swcrazyfan you can already load two instances at the same time.
#3377

Just use --device-id 0 in one and --device-id 1 in the other.
Also --port some_port_number with a different port for each instance.

Of course it is not an optimal solution and you might need more RAM to run both instances. --lowram might help too.
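
As a rough illustration, a small launcher script could start one instance per GPU, assuming webui.py is invoked directly and the flags are passed through unchanged (the path and port numbers here are just examples):

```python
# Sketch: launch one webui instance per GPU, each pinned to its own device and port.
# The "webui.py" path is an assumption; point it at your actual checkout.
import subprocess

instances = []
for gpu_id, port in [(0, 7860), (1, 7861)]:
    instances.append(subprocess.Popen([
        "python", "webui.py",
        "--device-id", str(gpu_id),
        "--port", str(port),
        "--lowram",  # optional, if system RAM is tight with two instances
    ]))

for proc in instances:
    proc.wait()
```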

@precompute

Is this being worked on? It sounds like an awesome feature. Even if it's restricted to txt2img, it'd be a start.

I guess this would require major changes to the way images are handled right now; there would probably need to be a queue of sorts to make this work.
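
A queue of that sort could look roughly like the sketch below, where generate_on_gpu is a hypothetical stand-in for the real txt2img call:

```python
# Illustrative job queue: one worker thread per GPU pulls prompts off a shared queue.
import queue
import threading

import torch

def generate_on_gpu(prompt, device):
    # Hypothetical placeholder for the actual txt2img call on a specific device.
    print(f"would render {prompt!r} on {device}")

def worker(jobs, device):
    while True:
        prompt = jobs.get()
        if prompt is None:  # sentinel: shut this worker down
            break
        generate_on_gpu(prompt, device)
        jobs.task_done()

jobs = queue.Queue()
devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]
threads = [threading.Thread(target=worker, args=(jobs, d), daemon=True) for d in devices]
for t in threads:
    t.start()
```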

@Lukium commented Nov 7, 2022

> Hi! I could probably port this multi-GPU feature, but I would appreciate some pointers as to where in the code I should look for the actual model (I am using the vanilla one from Hugging Face). The easiest mode would be implementing a ~data parallel approach, in which we have one model per GPU and the workload is distributed among them. Given the number of features this repo provides, I think it could take some time to have them all supported in the parallel version. Let me know your thoughts on this.

I'd be happy to help test this if it's something that's being worked on. I'm currently running an 11x RTX 3090 server for a Discord community using @Extraltodeus's --device-id feature #3377, and I think that having some parallelism would further benefit the community greatly. I'm not sure if it's OK to mention community links here, but the info is in my profile, and you're welcome to DM me on Discord if it's something you would like help testing.

@Omegadarling

Just popping in to check on this. I also have an 8x 3090 machine and a 2x 3090 machine (both have 256 GB RAM) that would be great for testing parallelization.

@mezotaken removed the dreams label Jan 16, 2023
@zeigerpuppy

This would be a really great feature. Just being able to distribute a batch across GPUs would be a good start.

Having a round-robin for the "next GPU" would also be useful to distribute web requests across a pool of GPUs.
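
Such a round-robin dispatcher could be as simple as the following sketch, assuming one webui instance per GPU started with --api on different ports (the URLs are just examples):

```python
# Round-robin sketch: cycle incoming requests across a pool of per-GPU webui instances,
# each started with --api, --device-id and its own --port.
import itertools

import requests

BACKENDS = itertools.cycle([
    "http://127.0.0.1:7860",  # instance pinned to GPU 0
    "http://127.0.0.1:7861",  # instance pinned to GPU 1
])

def submit(payload):
    """Send a txt2img request to the next backend in the pool."""
    backend = next(BACKENDS)
    return requests.post(f"{backend}/sdapi/v1/txt2img", json=payload, timeout=600)
```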

@zeigerpuppy

P.S. I think this issue has drifted a bit from a memory question to a multi-GPU support question in general. It may be good to change the title to something like "Multi-GPU support for parallel queries". I think that is somewhat distinct from the original question about memory pooling (which is a much more difficult ask!).

@hananbeer (Contributor)

> Using memory from between two GPUs is not simple. I only have one so I can't research/develop this.

Well, let's get it funded then.

@moxSedai

I'm not sure this is really a parallel query question though, is it? I found it while looking for using multiple GPUs for a single query, and most of the discussion was based on that.

nne998 pushed a commit to fjteam/stable-diffusion-webui that referenced this issue Sep 26, 2023
Atry pushed a commit to Atry/stable-diffusion-webui that referenced this issue Jul 9, 2024: …ify AUTOMATIC1111#99 now returns the cheap_approx rather than grey image