From 282dbff4f859fbabed69188b72d33ddc274d2a7f Mon Sep 17 00:00:00 2001 From: Wallas Santos Date: Thu, 5 Sep 2024 15:34:54 -0300 Subject: [PATCH] [Docs] Updated docs for compatibility matrix Signed-off-by: Wallas Santos --- docs/source/serving/compatibility_matrix.rst | 49 +++++++++----------- 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/docs/source/serving/compatibility_matrix.rst b/docs/source/serving/compatibility_matrix.rst index 333d148281822..491c1d46b3d7f 100644 --- a/docs/source/serving/compatibility_matrix.rst +++ b/docs/source/serving/compatibility_matrix.rst @@ -7,7 +7,7 @@ The table below shows mutually exclusive features along with support for some de .. list-table:: :header-rows: 1 - :widths: 20 8 8 8 8 8 8 8 8 + :widths: 20 8 8 8 8 8 8 8 8 8 * - Feature - Chunked Prefill @@ -17,7 +17,8 @@ The table below shows mutually exclusive features along with support for some de - Speculative decoding - CUDA Graphs - Encoder/Decoder - - Logprobs* + - Logprobs + - Prompt Logprobs * - APC - ✅ - @@ -29,7 +30,7 @@ The table below shows mutually exclusive features along with support for some de - - * - LoRa - - ❌ [[1C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/config.py#L1381)] + - ❌ `[C] `__ - ✅ - - @@ -49,9 +50,9 @@ The table below shows mutually exclusive features along with support for some de - - * - Speculative decoding - - ❌ [[2C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/config.py#L1024)] [[3T](https://github.com/vllm-project/vllm/issues/5016)] + - ❌ `[C] `__ `[T] `__ - ✅ - - ❌ [[4C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/spec_decode/spec_decode_worker.py#L85-L86)] + - ❌ `[C] `__ - ✅ - - @@ -69,12 +70,12 @@ The table below shows mutually exclusive features along with support for some de - - * - Encoder/Decoder - - ❌ 
[[5C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L24)] - - ❌ [[6C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L16)][[18T](https://github.com/vllm-project/vllm/issues/7366)] - - ❌ [[7C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L35C1-L36C1)] - - ❌ [[8C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L55)] - - ❌ [[9C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L47)][[18T](https://github.com/vllm-project/vllm/issues/7366)] - - ❌ [[10C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/worker/utils.py#L51)][[19T](https://github.com/vllm-project/vllm/issues/7447)] + - ❌ `[C] `__ + - ❌ `[C] `__ `[T] `__ + - ❌ `[C] `__ + - ❌ `[C] `__ + - ❌ `[C] `__ `[T] `__ + - ❌ `[C] `__ `[T] `__ - - - @@ -83,7 +84,7 @@ The table below shows mutually exclusive features along with support for some de - ✅ - ✅ - ✅ - - ❌ [[11C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/engine/output_processor/multi_step.py#L52)] + - ❌ `[C] `__ - ✅ - ✅ - @@ -93,7 +94,7 @@ The table below shows mutually exclusive features along with support for some de - ✅ - ✅ - ✅ - - ❌ [[11C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/engine/output_processor/multi_step.py#L52)] + - ❌ `[C] `__ - ✅ - ✅ - ✅ @@ -109,13 +110,13 @@ The table below shows mutually exclusive features along with support for some de - ✅ - ✅ * - CPU - - ❌ [[12C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/executor/cpu_executor.py#L328)] - - ❌ [[13C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/executor/cpu_executor.py#L337)] - - ❌ 
[[14C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/executor/cpu_executor.py#L29)] + - ❌ `[C] `__ + - ❌ `[C] `__ + - ❌ `[C] `__ `[T] `__ - ? - ✅ - - ❌ [[15C](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/vllm/executor/cpu_executor.py#L318)] - - ❌ [[16](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/tests/models/test_bart.py#L8)] + - ❌ `[C] `__ + - ❌ `[C] `__ - ✅ - ✅ * - AMD @@ -125,15 +126,11 @@ The table below shows mutually exclusive features along with support for some de - ✅ - ✅ - ✅ - - ❌ [[17](https://github.com/vllm-project/vllm/blob/757ac70a64b5a643b68281c0b65f72f847cedbd6/tests/kernels/test_encoder_decoder_attn.py#L753)] + - ❌ `[C] `__ - ✅ - ✅ - + Note: -- Logprobs include the support for both output logbrobs and prompt logprobs. - -Related Issues: - -- Encoder/Decoder feature compatibility https://github.com/vllm-project/vllm/issues/7366 -- Speculative decoding with chunked prefill https://github.com/vllm-project/vllm/issues/5016 +[C] stands for code checks: a runtime check verifies whether the combination is valid and either raises an error or logs a warning and disables the feature. +[T] stands for tracking issues or pull requests in the vLLM repository. \ No newline at end of file
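The [C] marker in the note refers to runtime checks that either reject an invalid feature combination or warn and switch a feature off. The snippet below is a minimal sketch of that pattern, not vLLM's actual code: `FeatureFlags` and `check_compatibility` are hypothetical names, and which pair raises versus warns is chosen arbitrarily for illustration (the matrix only marks both pairs as incompatible).

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("compat_sketch")


@dataclass
class FeatureFlags:
    # Hypothetical flags named after entries in the matrix;
    # this is not vLLM's real configuration class.
    chunked_prefill: bool = False
    speculative_decoding: bool = False
    lora: bool = False


def check_compatibility(flags: FeatureFlags) -> FeatureFlags:
    """Validate a feature combination in the style of a [C] code check."""
    # Hard check: refuse to start with an unsupported combination.
    if flags.speculative_decoding and flags.chunked_prefill:
        raise ValueError(
            "Speculative decoding and chunked prefill cannot be enabled together."
        )
    # Soft check: log a warning and disable one of the two features.
    if flags.lora and flags.chunked_prefill:
        logger.warning(
            "Chunked prefill is not supported with LoRA; disabling chunked prefill."
        )
        flags.chunked_prefill = False
    return flags


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    resolved = check_compatibility(FeatureFlags(lora=True, chunked_prefill=True))
    print(resolved)
```

In this sketch the soft check mutates the flags and continues, while the hard check stops early with an exception; a real engine would decide per combination which behaviour is appropriate.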