vllm-project / vllm Public

Pull requests: vllm-project/vllm

475 Open 5,524 Closed

Pull requests list

Properly check if all fused layers are in the list of targets ready

ONLY add when PR is ready to merge/full CI is needed

#12666 opened Feb 2, 2025 by eldarkurtic

Fix broken cmake on AMD platform ci/build

#12665 opened Feb 2, 2025 by kagurazakasanae

[AMD][ROCm] Enable DeepSeek model on ROCm

#12662 opened Feb 2, 2025 by hongxiayang

[VLM] Implement merged multimodal processor and V1 support for idefics3

#12660 opened Feb 2, 2025 by Isotr0py • Draft

2 of 4 tasks

[Bugfix] Update Prometheus datasource configuration to use variable UID

#12659 opened Feb 2, 2025 by liuyanyi

[WIP] Hybrid allocator for full attention & sliding window attention interleaved models needs-rebase v1

#12655 opened Feb 2, 2025 by heheda12345 • Draft

[WIP][Attention] KV Splits heuristic for MLA

#12654 opened Feb 2, 2025 by LucasWilkinson • Draft

[Frontend] support AWS SageMaker inference id documentation

Improvements or additions to documentation

frontend

#12652 opened Feb 1, 2025 by bmuskalla

Fix: benchmark_prioritization.py has problems constructing requests w…

#12646 opened Feb 1, 2025 by Accelerator1996

[V1][Metrics] Add several request timing histograms ready

#12644 opened Feb 1, 2025 by markmc • Draft

[Core] BatchLLM for better shared prefix utilizing in offline scenarios frontend

#12641 opened Feb 1, 2025 by xinji1

[Hardware][Metal] Apple Metal support ci/build

#12640 opened Feb 1, 2025 by skyzh • Draft

[WIP][Attention] WIP MLA with chunked prefill

#12639 opened Feb 1, 2025 by LucasWilkinson • Draft

Fix get_device_name for cuda platforms that return bytes

#12636 opened Feb 1, 2025 by mgoin

[Model][Quant] Fix GLM, Fix fused module mappings for quantization

#12634 opened Jan 31, 2025 by kylesayrs • Draft

[Core] choice-based structured output with xgrammar ci/build ready

structured-output

#12632 opened Jan 31, 2025 by russellb

[Build] update requirements of no-device for plugin usage ci/build

#12630 opened Jan 31, 2025 by sducouedic

[Core] Add Additional Metrics to vLLM Server

#12627 opened Jan 31, 2025 by sahelib25

[CI] Fix flaky CI test ci/build speculative-decoding

#12626 opened Jan 31, 2025 by NickLucche

[ROCm] Using a more precise memory profiling

#12624 opened Jan 31, 2025 by gshtras

[Hardware][TPU] Multi-LoRA implementation for the TPU backend

#12623 opened Jan 31, 2025 by Akshat-Tripathi

Default to generation_config from model frontend

#12622 opened Jan 31, 2025 by hmellor

[Core] Improve hash collision avoidance in prefix caching needs-rebase ready

#12621 opened Jan 31, 2025 by russellb

Adding cpu inference with VXE ISA for s390x architecture ci/build

#12613 opened Jan 31, 2025 by dilipgb

Fix quark fp8 format loading

#12612 opened Jan 31, 2025 by fxmarty-amd
