Commit

[Doc] Compatibility matrix for mutual exclusive features (vllm-project#8512)

Signed-off-by: Wallas Santos <wallashss@ibm.com>
Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
wallashss authored and mfournioux committed Nov 20, 2024
1 parent 4502b4e commit 803f245
Showing 13 changed files with 467 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -86,6 +86,7 @@ Documentation
serving/usage_stats
serving/integrations
serving/tensorizer
serving/compatibility_matrix
serving/faq

.. toctree::
2 changes: 2 additions & 0 deletions docs/source/models/performance.rst
@@ -22,6 +22,8 @@ If you frequently encounter preemptions from the vLLM engine, consider the follo

You can also monitor the number of preemption requests through Prometheus metrics exposed by the vLLM. Additionally, you can log the cumulative number of preemption requests by setting disable_log_stats=False.

.. _chunked-prefill:

Chunked Prefill
---------------
vLLM supports an experimental feature chunked prefill. Chunked prefill allows to chunk large prefills into smaller chunks and batch them together with decode requests.