torch.cuda.OutOfMemoryError: CUDA out of memory. #2

ganlinganlin · 2023-11-30T08:51:47Z

Hello, my friend
During the training of Llama-2-13B on an A30 GPU equipped with 24GB of video memory, I am facing an error concerning GPU memory allocation. Are there any feasible solutions or code modifications that can resolve this issue?

Error: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 23.50 GiB total capacity; 23.16 GiB already allocated; 2.81 MiB free; 23.16 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
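A note on reading this message: allocated and reserved memory are both 23.16 GiB, so the hint about `max_split_size_mb` (which targets the case where reserved is much larger than allocated) probably does not apply, and the card is simply full. The sketch below pulls the figures out of the message text to make that comparison explicit. `parse_oom` and its regular expression are illustrative helpers, not part of PyTorch, and they assume the exact message format quoted above.

```python
import re

# The error text from this thread, copied verbatim.
ERROR = (
    "CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 23.50 GiB total capacity; "
    "23.16 GiB already allocated; 2.81 MiB free; 23.16 GiB reserved in total by PyTorch)"
)

# Conversion factors from each unit to GiB.
UNIT_TO_GIB = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0}

# Hypothetical helper: maps a short key to the label that follows the number.
FIELDS = {
    "total": "total capacity",
    "allocated": "already allocated",
    "free": "free",
    "reserved": "reserved in total",
}


def parse_oom(message: str) -> dict:
    """Return the memory figures from an OOM message, all converted to GiB."""
    stats = {}
    for key, label in FIELDS.items():
        match = re.search(r"([\d.]+) (KiB|MiB|GiB) " + re.escape(label), message)
        if match is None:
            raise ValueError(f"could not find '{label}' in message")
        stats[key] = float(match.group(1)) * UNIT_TO_GIB[match.group(2)]
    return stats


stats = parse_oom(ERROR)
# Memory held by the caching allocator but not backing any live tensor.
slack = stats["reserved"] - stats["allocated"]
print(f"reserved - allocated = {slack:.2f} GiB")
print(f"allocated / total    = {stats['allocated'] / stats['total']:.1%}")
```

A large gap between reserved and allocated would point to fragmentation. With no gap, the practical options are to lower the memory demand of the run or to use a larger GPU.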

Thanks!

LHRLAB · 2023-11-30T14:46:05Z

Our recommended setup includes an A40 (48GB) GPU for training and inference. If the CUDA memory is insufficient, you can switch to smaller models like Llama-2-7B, or reduce the batch size, among other adjustments.
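To see roughly why the model size matters here, the sketch below estimates the memory taken by the weights alone. It assumes nominal parameter counts (13e9 and 7e9) and 2 bytes per parameter (fp16/bf16); the thread does not state the precision or training method actually used, so treat the numbers as a rough lower bound rather than a measurement. `weights_gib` is an illustrative helper.

```python
GIB = 1024**3
A30_TOTAL_GIB = 23.50  # "total capacity" reported in the error message


def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights only, in GiB (assumes 2 bytes/param by default)."""
    return n_params * bytes_per_param / GIB


# Nominal parameter counts; the real checkpoints differ slightly.
for name, n_params in {"Llama-2-13B": 13e9, "Llama-2-7B": 7e9}.items():
    need = weights_gib(n_params)
    verdict = "fits" if need < A30_TOTAL_GIB else "does not fit"
    print(f"{name}: ~{need:.1f} GiB of weights, {verdict} in {A30_TOTAL_GIB} GiB")
```

Under these assumptions the 13B weights by themselves exceed the A30's capacity, while the 7B weights leave headroom. Training also needs memory for gradients, optimizer state, and activations; activation memory grows with batch size, which is why reducing it is the other suggested adjustment.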
