Pull requests: vllm-project/vllm
Pull requests list
Properly check if all fused layers are in the list of targets
ready
ONLY add when PR is ready to merge/full CI is needed
#12666
opened Feb 2, 2025 by
eldarkurtic
[Bugfix] Update Prometheus datasource configuration to use variable UID
#12659
opened Feb 2, 2025 by
liuyanyi
[WIP] Hybrid allocator for full attention & sliding window attention interleaved models
needs-rebase
v1
#12655
opened Feb 2, 2025 by
heheda12345
Draft
[Frontend] support AWS SageMaker inference id
documentation
Improvements or additions to documentation
frontend
#12652
opened Feb 1, 2025 by
bmuskalla
Fix: benchmark_prioritization.py has problems constructing requests w…
#12646
opened Feb 1, 2025 by
Accelerator1996
[Core] BatchLLM for better shared prefix utilizing in offline scenarios
frontend
#12641
opened Feb 1, 2025 by
xinji1
Fix get_device_name for cuda platforms that return bytes
#12636
opened Feb 1, 2025 by
mgoin
[Core] choice-based structured output with xgrammar
ci/build
ready
structured-output
#12632
opened Jan 31, 2025 by
russellb
[Build] update requirements of no-device for plugin usage
ci/build
#12630
opened Jan 31, 2025 by
sducouedic
[CI] Fix flaky CI test
ci/build
speculative-decoding
#12626
opened Jan 31, 2025 by
NickLucche
[Hardware][TPU] Multi-LoRA implementation for the TPU backend
#12623
opened Jan 31, 2025 by
Akshat-Tripathi
[Core] Improve hash collision avoidance in prefix caching
needs-rebase
ready
v1
#12621
opened Jan 31, 2025 by
russellb
Adding cpu inference with VXE ISA for s390x architecture
ci/build
#12613
opened Jan 31, 2025 by
dilipgb