GPU memory error in huge data #3
Thank you for your interest in our work, and sorry for the delayed response. Your Xenium dataset comprises 1,534,691 cells and 330 genes. This exceeds the current capacity of the SPACE model, which, due to its full-batch training regimen, must hold the entire graph in GPU memory at once. A simple compromise for now is to slice the data into regions of about 80,000 cells based on spatial location and train them separately. We are aware of the SPACE model's limitations on cell count, and we are working on a new model that can handle spatial transcriptome data of more than 10 million cells; we hope to make a public version available for people to try out before long.
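The slicing idea above can be sketched in a few lines. This is a minimal NumPy illustration, not part of the SPACE codebase: the function name `spatial_tiles`, the `coords` array, and the quantile-grid scheme are all assumptions, and the target of 80,000 cells per tile comes from the suggestion above.

```python
import numpy as np

def spatial_tiles(coords, target=80_000):
    """Assign each cell a tile id by cutting x, then y, into quantile bins,
    so every tile holds on the order of `target` cells."""
    n = coords.shape[0]
    k = max(1, int(np.ceil(np.sqrt(n / target))))  # strips per axis
    # Interior quantiles serve as bin edges, giving k roughly equal strips.
    x_edges = np.quantile(coords[:, 0], np.linspace(0, 1, k + 1)[1:-1])
    y_edges = np.quantile(coords[:, 1], np.linspace(0, 1, k + 1)[1:-1])
    ix = np.digitize(coords[:, 0], x_edges)
    iy = np.digitize(coords[:, 1], y_edges)
    return ix * k + iy  # one integer tile label per cell

# Illustrative run at the scale mentioned in the issue (random coordinates).
coords = np.random.default_rng(0).uniform(0, 1000, size=(1_534_691, 2))
labels = spatial_tiles(coords)
```

Each tile can then be trained as an independent SPACE run. Quantile edges keep tile sizes balanced even when cell density varies across the slide.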
Thanks for replying. What do you think about modifying the learning method in the model.encode function so the data can be processed without trimming it? I am not familiar with libraries such as PyTorch, so I would appreciate your opinion on these ideas. Finally, thank you for your kind response to my vague question.
I think it is feasible to incorporate a mini-batch strategy into the training of the SPACE model, and I anticipate that the outcomes would not be substantially different. However, generating suitable mini-batches may require empirical testing, as it involves segmenting the entire graph into smaller subsets.
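To make the graph-segmentation idea concrete, here is a hedged NumPy sketch of the simplest possible scheme: partition the nodes into contiguous chunks and keep only the edges whose endpoints both fall inside a chunk, relabelling indices so each subgraph is self-contained. The names `x` and `edge_index` mirror the tensors mentioned in the issue, but `subgraph_batches` is illustrative pseudocode, not the SPACE training loop, and dropping cross-chunk edges loses some neighborhood information.

```python
import numpy as np

def subgraph_batches(x, edge_index, batch_size):
    """Yield (node_features, relabelled_edge_index) per contiguous node chunk.
    Edges crossing chunk boundaries are discarded in this naive scheme."""
    n = x.shape[0]
    for start in range(0, n, batch_size):
        nodes = np.arange(start, min(start + batch_size, n))
        mask = (np.isin(edge_index[0], nodes)
                & np.isin(edge_index[1], nodes))
        # Relabelling by subtracting `start` works because each chunk is a
        # contiguous index range.
        yield x[nodes], edge_index[:, mask] - start

# Toy example: 10 nodes on a ring, split into batches of 4 nodes.
x = np.arange(10, dtype=float).reshape(10, 1)
ring = np.stack([np.arange(10), (np.arange(10) + 1) % 10])
batches = list(subgraph_batches(x, ring, 4))
```

In practice a smarter partition (e.g. the Cluster-GCN style loaders in PyTorch Geometric, such as `ClusterData`/`ClusterLoader`) keeps most edges within batches and would likely be the more idiomatic route.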
Thanks for your fantastic work.
I am attempting to run SPACE on a huge Xenium dataset.
However, the input is too large, and the model.encode function in train.py fails with a GPU memory allocation error.
Error output:
I have 4 GPUs with 48GB memory in my runtime environment, which is not enough for this process.
The size of the tensor input to model.encode is 1,534,691 × 330 for x and 2 × 8,804,994 for edge_index.
I have tried to rewrite some of the scripts for this data, but without any knowledge of torch, I am not able to get it to work.
Any ideas would be appreciated.