
File Limit Request: vllm - 400 MiB #3792

Closed

youkaichao opened this issue Mar 26, 2024 · 11 comments
@youkaichao

Project URL

https://pypi.org/project/vllm/

Does this project already exist?

  • Yes

New Limit

400

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

vLLM is a fast and easy-to-use library for LLM inference and serving.

We plan to ship nvidia-nccl-cu12==2.18.3 within the package.

Reasons for the request

We identified a bug in nccl>=2.19 that significantly increases GPU memory overhead, so we have to pin and ship an nccl version ourselves.

We cannot simply pip install nvidia-nccl-cu12==2.18.3 because we depend on torch, which has a binary dependency on nvidia-nccl-cu12==2.19.5. So we are in dependency hell, and we have to bundle an nccl library ourselves.
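
For context, a minimal sketch (with a hypothetical path and module layout, not vLLM's actual packaging) of how a bundled libnccl could be loaded at runtime instead of the copy that torch's nvidia-nccl-cu12 dependency installs:

```python
# Minimal sketch (hypothetical layout): load a libnccl.so.2 shipped inside the
# wheel instead of the nvidia-nccl-cu12==2.19.5 copy pulled in via torch.
import ctypes
from pathlib import Path

# Hypothetical location of the bundled library inside the installed package.
_BUNDLED_NCCL = Path(__file__).parent / "nccl" / "libnccl.so.2"

def load_bundled_nccl() -> ctypes.CDLL:
    """Load the pinned NCCL (2.18.3) and sanity-check its version."""
    lib = ctypes.CDLL(str(_BUNDLED_NCCL))
    version = ctypes.c_int()
    # ncclGetVersion is part of the public NCCL C API; 2.18.3 reports 21803.
    if lib.ncclGetVersion(ctypes.byref(version)) != 0:
        raise RuntimeError("ncclGetVersion failed")
    assert version.value == 21803, f"unexpected NCCL version {version.value}"
    return lib
```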

vLLM is a popular library for LLM inference, and it is used by many tech companies. Shipping nccl with vLLM can increase its throughput and the quality of LLM serving. However, the downside is that the package wheel becomes much larger, so we are coming here to ask for a larger file size limit.

Code of Conduct

  • I agree to follow the PSF Code of Conduct
@youkaichao
Author

bump up 👀

@youkaichao
Author

bump up 👀

@mgoin

mgoin commented Apr 23, 2024

+1, it would be great to have this!

@agt

agt commented Apr 23, 2024

From README.md

Large (more than 200MiB) upload limits are generally granted for the following reasons:

project contains large compiled binaries to maintain platform/architecture/GPU support

Project maintainers are having to limit or cut architecture/GPU/format support in order to fit under 100 MB:
vllm-project/vllm#4290
vllm-project/vllm#4304

@zhuohan123

Kindly cc @cmaureir for visibility. vLLM is the most popular open-source LLM serving engine in the world right now. Having a larger package limit can help us support more different types of hardware, and help democratize LLMs to the vast majority of developers.

@WoosukKwon

WoosukKwon commented Apr 24, 2024

Hi @cmaureir, I'm also a maintainer of vLLM. We do make our best effort to keep the binary size small, but it's increasingly difficult to meet the current limit since vLLM is rapidly growing with new features and optimizations that require new GPU kernels (binaries). Increasing the limit would be very helpful for the development of vLLM.

@cmaureir
Member

cmaureir commented May 8, 2024

Hello @youkaichao 👋
I have set the new upload limit for vllm to 400 MiB, mainly to unlock your release process, but I'm noting that it's highly probable your project will reach the total project limit soon because it bundles an additional package. This is neither encouraged nor recommended.

Additionally, I see you have one package per Python version, which heavily increases the total release size. I recommend looking into the Python Limited API in order to provide one wheel per platform: https://docs.python.org/3/c-api/stable.html
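
For illustration only, a rough sketch of the packaging side of that suggestion, using placeholder names rather than vLLM's actual layout; the Py_LIMITED_API macro itself still has to be defined at compile time, as discussed later in the thread:

```python
# Hypothetical setup.py sketch: emit a single abi3 wheel (e.g. tagged
# cp38-abi3-manylinux_x86_64) instead of one wheel per CPython minor version.
# Names and paths are placeholders, not vLLM's actual layout.
from setuptools import Extension, setup

setup(
    name="example-limited-api",
    version="0.0.1",
    ext_modules=[
        Extension(
            "example._C",                  # placeholder extension module
            sources=["csrc/example.cpp"],  # placeholder source file
            py_limited_api=True,           # build the extension as .abi3.so
        )
    ],
    # Tag the wheel as cp38-abi3 so one artifact covers Python 3.8 and newer.
    options={"bdist_wheel": {"py_limited_api": "cp38"}},
)
```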

Have a nice rest of the week 🚀

cmaureir closed this as completed on May 8, 2024
@youkaichao
Author

@cmaureir thanks for your support! We will try to see if we can build just one wheel for all Python versions.

@youkaichao
Author

@cmaureir is it possible to build one wheel for all supported Python versions when we have extensions? I find that the wheel name always contains the Python version, and I'm not sure how to build a Python-version-agnostic wheel.

@youkaichao
Author

I did a quick investigation:

To use the Python Limited API in order to provide one wheel per platform:

  1. Add a flag to the wheel build: python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38
  2. Add a macro during compilation: #define Py_LIMITED_API 0x03080000 (or it can be set via extension arguments, cf. https://stackoverflow.com/a/69073115/9191338 ); a sketch follows below.
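
For illustration, a minimal sketch of step 2 done through extension arguments rather than by editing the C++ sources (all names here are placeholders):

```python
# Hypothetical Extension definition: target the CPython 3.8 stable ABI by
# defining Py_LIMITED_API here instead of inside the source files. The
# resulting object would be passed to setup(ext_modules=[ext]) as usual.
from setuptools import Extension

ext = Extension(
    "example._C",                                      # placeholder name
    sources=["csrc/example.cpp"],                      # placeholder source
    define_macros=[("Py_LIMITED_API", "0x03080000")],  # stable ABI for 3.8+
    py_limited_api=True,                               # name the artifact .abi3.so
)
```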

I tried it; however, since we use pybind11, which does not support the Python Limited API (cf. pybind/pybind11#1755), we have to build one wheel for each Python version.

Sorry for the trouble :(

@simon-mo

Hi @cmaureir, I would like to ask about the current total usage of the vLLM packages and whether we can increase the 10 GB project limit. We have made quite some progress over the last few months: we are finally releasing version-agnostic wheels.
