Commit
- **[Bugfix] Fix score api for missing max_model_len validation (vllm-project#12119)**
- **[Bugfix] Mistral tokenizer encode accept list of str (vllm-project#12149)**
- **[AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (vllm-project#12134)**
- **[torch.compile] disable logging when cache is disabled (vllm-project#12043)**
- **[misc] fix cross-node TP (vllm-project#12166)**
- **[AMD][CI/Build][Bugfix] use pytorch stale wheel (vllm-project#12172)**
- **[core] further polish memory profiling (vllm-project#12126)**
- **[Docs] Fix broken link in SECURITY.md (vllm-project#12175)**
- **[Model] Port deepseek-vl2 processor, remove dependency (vllm-project#12169)**
- **[core] clean up executor class hierarchy between v1 and v0 (vllm-project#12171)**
- **[Misc] Support register quantization method out-of-tree (vllm-project#11969)**
- **[V1] Collect env var for usage stats (vllm-project#12115)**
- **[BUGFIX] Move scores to float32 in case of running xgrammar on cpu (vllm-project#12152)**
- **[Bugfix] Fix multi-modal processors for transformers 4.48 (vllm-project#12187)**
- **[torch.compile] store inductor compiled Python file (vllm-project#12182)**
- **benchmark_serving support --served-model-name param (vllm-project#12109)**
- **[Misc] Add BNB support to GLM4-V model (vllm-project#12184)**
- **[V1] Add V1 support of Qwen2-VL (vllm-project#12128)**
- **[Model] Support for fairseq2 Llama (vllm-project#11442)**
- **[Bugfix] Fix num_heads value for simple connector when tp enabled (vllm-project#12074)**
- **[torch.compile] fix sym_tensor_indices (vllm-project#12191)**
- **Move linting to `pre-commit` (vllm-project#11975)**
- **[DOC] Fix typo in docstring and assert message (vllm-project#12194)**
- **[DOC] Add missing docstring in LLMEngine.add_request() (vllm-project#12195)**
- **[Bugfix] Fix incorrect types in LayerwiseProfileResults (vllm-project#12196)**
- **[Model] Add Qwen2 PRM model support (vllm-project#12202)**
- **[Core] Interface for accessing model from `VllmRunner` (vllm-project#10353)**
- **[misc] add placeholder format.sh (vllm-project#12206)**
- **[CI/Build] Remove dummy CI steps (vllm-project#12208)**
- **[CI/Build] Make pre-commit faster (vllm-project#12212)**
- **[Model] Upgrade Aria to transformers 4.48 (vllm-project#12203)**
- **[misc] print a message to suggest how to bypass commit hooks (vllm-project#12217)**
- **[core][bugfix] configure env var during import vllm (vllm-project#12209)**
- **[V1] Remove `_get_cache_block_size` (vllm-project#12214)**
- **[Misc] Pass `attention` to impl backend (vllm-project#12218)**
- **[Bugfix] Fix `HfExampleModels.find_hf_info` (vllm-project#12223)**
- **[CI] Pass local python version explicitly to pre-commit mypy.sh (vllm-project#12224)**
- **[Misc] Update CODEOWNERS (vllm-project#12229)**
- **fix: update platform detection for M-series arm based MacBook processors (vllm-project#12227)**
- **[misc] add cuda runtime version to usage data (vllm-project#12190)**
- **[bugfix] catch xgrammar unsupported array constraints (vllm-project#12210)**
- **[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) (vllm-project#12222)**
- **Add quantization and guided decoding CODEOWNERS (vllm-project#12228)**
- **[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (vllm-project#11777)**
- **[BugFix] Fix GGUF tp>1 when vocab_size is not divisible by 64 (vllm-project#12230)**
- **[ci/build] disable failed and flaky tests (vllm-project#12240)**
- **[Misc] Rename `MultiModalInputsV2 -> MultiModalInputs` (vllm-project#12244)**
- **[Misc]Add BNB quantization for PaliGemmaForConditionalGeneration (vllm-project#12237)**
- **[Misc] Remove redundant TypeVar from base model (vllm-project#12248)**
- **[Bugfix] Fix mm_limits access for merged multi-modal processor (vllm-project#12252)**

---------

Signed-off-by: Wallas Santos <wallashss@ibm.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: hongxyan <hongxyan@amd.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
Signed-off-by: zibai <zibai.gj@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Martin Gleize <mgleize@meta.com>
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: isikhi <huseyin.isik000@gmail.com>
Signed-off-by: Jason Cheng <jasoncky96@gmail.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Wallas Henrique <wallashss@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: yancong <32220263+ice-tong@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: gujing <925973396@qq.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: imkero <kerorek@outlook.com>
Co-authored-by: Martin Gleize <mgleize@meta.com>
Co-authored-by: mgleize user <mgleize@a100-st-p4de24xlarge-4.fair-a100.hpcaas>
Co-authored-by: shangmingc <caishangming@linux.alibaba.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Işık <41375111+isikhi@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cheng Kuan Yong Jason <jasoncky96@gmail.com>
Co-authored-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>