Rebase 2025.01.21 (#714)
- **[Bugfix] Fix score api for missing max_model_len validation
(vllm-project#12119)**
- **[Bugfix] Mistral tokenizer encode accept list of str (vllm-project#12149)**
- **[AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (vllm-project#12134)**
- **[torch.compile] disable logging when cache is disabled (vllm-project#12043)**
- **[misc] fix cross-node TP (vllm-project#12166)**
- **[AMD][CI/Build][Bugfix] use pytorch stale wheel (vllm-project#12172)**
- **[core] further polish memory profiling (vllm-project#12126)**
- **[Docs] Fix broken link in SECURITY.md (vllm-project#12175)**
- **[Model] Port deepseek-vl2 processor, remove dependency (vllm-project#12169)**
- **[core] clean up executor class hierarchy between v1 and v0
(vllm-project#12171)**
- **[Misc] Support register quantization method out-of-tree (vllm-project#11969)**
- **[V1] Collect env var for usage stats (vllm-project#12115)**
- **[BUGFIX] Move scores to float32 in case of running xgrammar on cpu
(vllm-project#12152)**
- **[Bugfix] Fix multi-modal processors for transformers 4.48 (vllm-project#12187)**
- **[torch.compile] store inductor compiled Python file (vllm-project#12182)**
- **benchmark_serving support --served-model-name param (vllm-project#12109)**
- **[Misc] Add BNB support to GLM4-V model (vllm-project#12184)**
- **[V1] Add V1 support of Qwen2-VL (vllm-project#12128)**
- **[Model] Support for fairseq2 Llama (vllm-project#11442)**
- **[Bugfix] Fix num_heads value for simple connector when tp enabled
(vllm-project#12074)**
- **[torch.compile] fix sym_tensor_indices (vllm-project#12191)**
- **Move linting to `pre-commit` (vllm-project#11975)**
- **[DOC] Fix typo in docstring and assert message (vllm-project#12194)**
- **[DOC] Add missing docstring in LLMEngine.add_request() (vllm-project#12195)**
- **[Bugfix] Fix incorrect types in LayerwiseProfileResults (vllm-project#12196)**
- **[Model] Add Qwen2 PRM model support (vllm-project#12202)**
- **[Core] Interface for accessing model from `VllmRunner` (vllm-project#10353)**
- **[misc] add placeholder format.sh (vllm-project#12206)**
- **[CI/Build] Remove dummy CI steps (vllm-project#12208)**
- **[CI/Build] Make pre-commit faster (vllm-project#12212)**
- **[Model] Upgrade Aria to transformers 4.48 (vllm-project#12203)**
- **[misc] print a message to suggest how to bypass commit hooks
(vllm-project#12217)**
- **[core][bugfix] configure env var during import vllm (vllm-project#12209)**
- **[V1] Remove `_get_cache_block_size` (vllm-project#12214)**
- **[Misc] Pass `attention` to impl backend (vllm-project#12218)**
- **[Bugfix] Fix `HfExampleModels.find_hf_info` (vllm-project#12223)**
- **[CI] Pass local python version explicitly to pre-commit mypy.sh
(vllm-project#12224)**
- **[Misc] Update CODEOWNERS (vllm-project#12229)**
- **fix: update platform detection for M-series arm based MacBook
processors (vllm-project#12227)**
- **[misc] add cuda runtime version to usage data (vllm-project#12190)**
- **[bugfix] catch xgrammar unsupported array constraints (vllm-project#12210)**
- **[Kernel] optimize moe_align_block_size for cuda graph and large
num_experts (e.g. DeepSeek-V3) (vllm-project#12222)**
- **Add quantization and guided decoding CODEOWNERS (vllm-project#12228)**
- **[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (vllm-project#11777)**
- **[BugFix] Fix GGUF tp>1 when vocab_size is not divisible by 64
(vllm-project#12230)**
- **[ci/build] disable failed and flaky tests (vllm-project#12240)**
- **[Misc] Rename `MultiModalInputsV2 -> MultiModalInputs` (vllm-project#12244)**
- **[Misc]Add BNB quantization for PaliGemmaForConditionalGeneration
(vllm-project#12237)**
- **[Misc] Remove redundant TypeVar from base model (vllm-project#12248)**
- **[Bugfix] Fix mm_limits access for merged multi-modal processor
(vllm-project#12252)**

---------

Signed-off-by: Wallas Santos <wallashss@ibm.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: hongxyan <hongxyan@amd.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
Signed-off-by: zibai <zibai.gj@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Martin Gleize <mgleize@meta.com>
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: isikhi <huseyin.isik000@gmail.com>
Signed-off-by: Jason Cheng <jasoncky96@gmail.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Wallas Henrique <wallashss@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: yancong <32220263+ice-tong@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: gujing <925973396@qq.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: imkero <kerorek@outlook.com>
Co-authored-by: Martin Gleize <mgleize@meta.com>
Co-authored-by: mgleize user <mgleize@a100-st-p4de24xlarge-4.fair-a100.hpcaas>
Co-authored-by: shangmingc <caishangming@linux.alibaba.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Işık <41375111+isikhi@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cheng Kuan Yong Jason <jasoncky96@gmail.com>
Co-authored-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
1 parent 5424a93 commit c9db39b
Showing 160 changed files with 3,693 additions and 3,559 deletions.
2 changes: 1 addition & 1 deletion .buildkite/nightly-benchmarks/scripts/nightly-annotate.sh
@@ -43,7 +43,7 @@ main() {



-# The figures should be genereated by a separate process outside the CI/CD pipeline
+# The figures should be generated by a separate process outside the CI/CD pipeline

# # generate figures
# python3 -m pip install tabulate pandas matplotlib
9 changes: 6 additions & 3 deletions .buildkite/test-pipeline.yaml
@@ -52,7 +52,6 @@ steps:
- tests/worker
- tests/standalone_tests/lazy_torch_compile.py
commands:
- - pip install git+https://github.com/Isotr0py/DeepSeek-VL2.git # Used by multimoda processing test
- python3 standalone_tests/lazy_torch_compile.py
- pytest -v -s mq_llm_engine # MQLLMEngine
- pytest -v -s async_engine # AsyncLLMEngine
@@ -478,7 +477,9 @@ steps:
- pytest models/encoder_decoder/language/test_bart.py -v -s -m 'distributed(num_gpus=2)'
- pytest models/encoder_decoder/vision_language/test_broadcast.py -v -s -m 'distributed(num_gpus=2)'
- pytest models/decoder_only/vision_language/test_models.py -v -s -m 'distributed(num_gpus=2)'
- - pytest -v -s spec_decode/e2e/test_integration_dist_tp2.py
+ # this test fails consistently.
+ # TODO: investigate and fix
+ # - pytest -v -s spec_decode/e2e/test_integration_dist_tp2.py
- CUDA_VISIBLE_DEVICES=0,1 pytest -v -s test_sharded_state_loader.py
- CUDA_VISIBLE_DEVICES=0,1 pytest -v -s kv_transfer/disagg_test.py

@@ -516,7 +517,9 @@ steps:
- vllm/engine
- tests/multi_step
commands:
- - pytest -v -s multi_step/test_correctness_async_llm.py
+ # this test is quite flaky
+ # TODO: investigate and fix.
+ # - pytest -v -s multi_step/test_correctness_async_llm.py
- pytest -v -s multi_step/test_correctness_llm.py

- label: Pipeline Parallelism Test # 45min
40 changes: 0 additions & 40 deletions .github/workflows/actionlint.yml

This file was deleted.

53 changes: 0 additions & 53 deletions .github/workflows/clang-format.yml

This file was deleted.

45 changes: 0 additions & 45 deletions .github/workflows/codespell.yml

This file was deleted.

32 changes: 0 additions & 32 deletions .github/workflows/doc-lint.yml

This file was deleted.

17 changes: 0 additions & 17 deletions .github/workflows/matchers/ruff.json

This file was deleted.

51 changes: 0 additions & 51 deletions .github/workflows/mypy.yaml

This file was deleted.

37 changes: 0 additions & 37 deletions .github/workflows/png-lint.yml

This file was deleted.

19 changes: 19 additions & 0 deletions .github/workflows/pre-commit.yml
@@ -0,0 +1,19 @@
name: pre-commit

Check failure

Code scanning / Scorecard

Token-Permissions High

score is 0: no topLevel permission defined
Remediation tip: Visit https://app.stepsecurity.io/secureworkflow.
Tick the 'Restrict permissions for GITHUB_TOKEN'
Untick other options
NOTE: If you want to resolve multiple issues at once, you can visit https://app.stepsecurity.io/securerepo instead.
Click Remediation section below for further remediation help

on:
  pull_request:
  push:
    branches: [main]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
    - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
      with:
        python-version: "3.12"
    - run: echo "::add-matcher::.github/workflows/matchers/actionlint.json"
    - uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
      with:
        extra_args: --hook-stage manual
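The Scorecard Token-Permissions alert attached to this file reports that no top-level `permissions` key is defined. As a minimal sketch of one possible remediation (not part of this commit), a least-privilege block could be declared near the top of the workflow. The `contents: read` scope is an assumption that read-only repository access is enough for checkout and linting; a workflow that needs to write back to the repository would need more.

```yaml
# Hypothetical hardening for .github/workflows/pre-commit.yml.
# This is NOT in the commit above; it only illustrates the alert's remediation tip.
name: pre-commit

# Restrict the default GITHUB_TOKEN to read-only repository contents.
# Assumption: the pre-commit job only needs to check out and inspect code.
permissions:
  contents: read

on:
  pull_request:
  push:
    branches: [main]
```

The rest of the workflow (the `jobs:` section) would stay unchanged under this sketch.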
52 changes: 0 additions & 52 deletions .github/workflows/ruff.yml

This file was deleted.
