CUDA out of memory for 3k documents #247
Comments
Hello @aaraya-rr, same error here. I'm currently trying to create an index for around 128k documents, but I get the same CUDA out-of-memory error.

Procedure

My documents are stored in an Elasticsearch index, and my goal is to migrate this index into RAGatouille, adding the documents in batches (a sketch of this setup follows). However, around the 4th or 5th batch, the error occurs. I also observe a curious pattern in the indexing time, which increases from batch to batch.
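For context, here is a minimal sketch of the kind of batched migration I mean, not my actual code: the Elasticsearch endpoint, source index name, field name, and batch size are all hypothetical, and I'm assuming RAGatouille's index()/add_to_index() entry points.

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan
from ragatouille import RAGPretrainedModel

es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

BATCH_SIZE = 1000  # hypothetical batch size
batch, index_created = [], False

def flush(batch, index_created):
    """Index the first batch, append every later one."""
    if not index_created:
        RAG.index(collection=batch, index_name="migrated_index")
    else:
        RAG.add_to_index(new_collection=batch)
    return True

# scan() streams every document out of the source index.
for hit in scan(es, index="my_documents", query={"query": {"match_all": {}}}):
    batch.append(hit["_source"]["text"])  # "text" is a hypothetical field
    if len(batch) == BATCH_SIZE:
        index_created = flush(batch, index_created)
        batch = []

if batch:  # leftover partial batch
    index_created = flush(batch, index_created)
```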
Conclusions

The behavior described above leads me to conclude that it doesn't matter whether the indexing is done in batches: most likely, the previously indexed documents are being preprocessed again on every batch, since the per-batch time appears to follow the pattern t·b, where t is a time constant and b is the batch number (a rough check of this is sketched at the end of this comment). This could explain the memory issue: the processing may not be capable of splitting the workload effectively and tries to allocate the memory needed for all documents indexed so far. Additionally, CPU indexing is not an option for me, as it takes approximately 5 hours per 1,000 documents, which would amount to about 26 days for 128,000 documents. :'(

Potential fix (?)

I would appreciate any suggestions or guidance on potential fixes for this issue. Is there a way to optimize memory usage during indexing, or should I consider alternative approaches to avoid the CUDA out-of-memory error?

Probably related to #205
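A quick back-of-the-envelope check of the timing pattern described in the conclusions above (the time constant below is purely hypothetical, not a measurement): if batch b costs t·b, the total over B batches is t·B(B+1)/2, i.e. quadratic, which is what you'd expect if every batch re-processes all earlier documents.

```python
# If batch b takes t * b seconds, the total over B batches is
# t * B * (B + 1) / 2 -- quadratic growth, consistent with every
# batch re-processing all previously indexed documents.
t = 60.0  # hypothetical per-batch time constant, in seconds
B = 128   # e.g. 128k documents in batches of 1,000
total_hours = t * B * (B + 1) / 2 / 3600
print(f"estimated total indexing time: {total_hours:.1f} h")  # ~137.6 h
```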
Similar to #205, I am experiencing an issue where CUDA runs out of memory when processing 3k documents (which are actually chunks, as I am using my own splitter).
I've noticed in your release notes (v0.0.8 and #173) that you mention adding documents in the 100k–500k range, which makes me curious how you achieve that without running out of memory, given that I'm hitting memory issues on a T4 GPU (15,360 MiB) when processing just 3k documents.
What I find interesting is that when using CUDA_VISIBLE_DEVICES="", the process works and takes a relatively short time (around 3 hours; see the sketch after the error below). I would like to know if you are still working on a solution for this, or if there is any way to prevent CUDA from running out of memory, since CPU-only indexing already takes 3 hours and the GPU should improve performance significantly.

My code:
CUDA error:
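For anyone who wants to reproduce the CPU-only run mentioned above, a minimal sketch (the chunk list and index name are placeholders): the key is that CUDA_VISIBLE_DEVICES must be emptied before torch or RAGatouille is imported, or set on the command line instead, e.g. CUDA_VISIBLE_DEVICES="" python index.py.

```python
import os

# Hide all CUDA devices so indexing falls back to CPU.
# This must happen before torch / ragatouille are imported,
# otherwise CUDA may already have been initialised with the GPU visible.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

from ragatouille import RAGPretrainedModel

# Placeholder for the ~3k pre-split chunks.
chunks = ["first pre-split chunk ...", "second pre-split chunk ..."]

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
RAG.index(collection=chunks, index_name="cpu_only_index")
```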