Reduce GPU memory utilization to make sure OOM doesn't happen #153

zhuohan123 · 2023-06-18T09:01:43Z

Fixes #143. It turns out the OOM is caused by fragmentation within the PyTorch memory allocator.
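For context, a minimal sketch of why lowering the utilization fraction helps, assuming the cache is sized as a fraction of total GPU memory minus the peak memory seen during profiling. The function names and the numbers below are hypothetical illustrations, not code from this PR. Whatever is left outside the fraction acts as headroom that can absorb memory the allocator holds but cannot reuse because of fragmentation.

```python
GIB = 1024 ** 3


def cache_budget_bytes(total_bytes: int, utilization: float, peak_bytes: int) -> int:
    """Bytes available for a preallocated cache (hypothetical helper).

    total_bytes: total device memory
    utilization: fraction of device memory the process may use, in (0, 1]
    peak_bytes:  peak memory observed for weights and activations
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    budget = int(total_bytes * utilization) - peak_bytes
    # Never report a negative budget; zero means nothing fits.
    return max(budget, 0)


def headroom_bytes(total_bytes: int, utilization: float) -> int:
    """Bytes left outside the utilization fraction, free to absorb fragmentation."""
    return total_bytes - int(total_bytes * utilization)


if __name__ == "__main__":
    total = 40 * GIB  # illustrative device size
    peak = 14 * GIB   # illustrative profiled peak
    for fraction in (0.95, 0.90):  # illustrative fractions
        cache = cache_budget_bytes(total, fraction, peak)
        spare = headroom_bytes(total, fraction)
        print(f"fraction={fraction:.2f} cache={cache / GIB:.2f} GiB "
              f"headroom={spare / GIB:.2f} GiB")
```

A smaller fraction trades some cache capacity for a larger safety margin, which is the trade-off the PR title describes.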

WoosukKwon

Thanks!

Reduce GPU memory utilization to make sure OOM doesn't happen

df4f6ce

zhuohan123 requested a review from WoosukKwon June 18, 2023 09:02

WoosukKwon approved these changes Jun 18, 2023

zhuohan123 merged commit bf5f121 into main Jun 18, 2023

zhuohan123 deleted the reduce-gpu-memory-utilization branch June 19, 2023 08:31

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Reduce GPU memory utilization to make sure OOM doesn't happen (vllm-p…

f557242

…roject#153)

sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024

Reduce GPU memory utilization to make sure OOM doesn't happen (vllm-p…

1dde832

…roject#153)

yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024

update readme with nvcc threads option (vllm-project#153)

25bd431

Summary: Update benchmark readme Test: None --------- Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>

dtrifiro pushed a commit to dtrifiro/vllm that referenced this pull request Sep 30, 2024

Merge pull request vllm-project#153 from openshift-cherrypick-robot/c…

1c36661

…herry-pick-152-to-release [release] Add sample chat template into vLLM container
