[BugFix] Fix chat API continuous usage stats #9357

njhill · 2024-10-15T03:00:48Z

- `completion_tokens` in the continuous usage stats returned with each streamed chunk was not cumulative, as it should be.
- `completion_tokens` in the final usage stats should be the total across all choices.
- Includes some related code simplification/de-duplication.

This fixes an omission in the changes I made recently in the delta-streaming optimization #7381.
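
The intended behaviour can be illustrated with a minimal sketch. This is not the vLLM implementation: `UsageInfo`, `stream_usage`, and the `(choice_index, new_tokens)` input shape are hypothetical names chosen for illustration, and the sketch assumes each streamed chunk carries a per-choice running count while the final usage chunk sums over all choices.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class UsageInfo:
    # Hypothetical stand-in for the usage object attached to a chunk.
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def stream_usage(
    prompt_tokens: int,
    num_choices: int,
    deltas: Iterable[Tuple[int, int]],
) -> Iterator[Tuple[int, UsageInfo]]:
    """Yield (choice_index, usage) for each delta, then a final (-1, usage).

    `deltas` is a sequence of (choice_index, new_tokens) pairs, one per
    streamed chunk. Index -1 marks the final usage-only chunk (an
    assumption of this sketch, not a vLLM convention).
    """
    # Running count of generated tokens, kept separately for each choice.
    generated = [0] * num_choices

    for index, new_tokens in deltas:
        # Accumulate instead of reporting only this chunk's tokens.
        generated[index] += new_tokens
        completion = generated[index]
        yield index, UsageInfo(
            prompt_tokens=prompt_tokens,
            completion_tokens=completion,
            total_tokens=prompt_tokens + completion,
        )

    # The final usage stats cover all choices together.
    total = sum(generated)
    yield -1, UsageInfo(
        prompt_tokens=prompt_tokens,
        completion_tokens=total,
        total_tokens=prompt_tokens + total,
    )


if __name__ == "__main__":
    for index, usage in stream_usage(5, 2, [(0, 1), (1, 2), (0, 3)]):
        print(index, usage)
```

In this sketch the third chunk reports the running count for choice 0 rather than only the tokens in that chunk, and the final chunk reports the sum over both choices.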

DarkLight1337

Entrypoints tests pass, so this should be good to go!

Similar to what was done for the chat API in vllm-project#9357. Ensure that the final chunk with usage data contains aggregate counts across all choices. Also simplify some of the prompt-handling logic in the API implementation.

[BugFix] Fix chat API continuous usage stats

558e3a3

- completion_tokens in returned continuous usage stats was not cumulative as it should be.
- completion_tokens in the final usage stats should be the total for all choices.
- Includes some related code simplification/deduplication.

njhill force-pushed the fix-chat-stream-usage branch from fcb0d5e to 558e3a3 October 15, 2024 03:33

njhill marked this pull request as ready for review October 15, 2024 03:34

njhill requested review from DarkLight1337, robertgshaw2-neuralmagic and simon-mo as code owners October 15, 2024 03:34

vllm-project deleted a comment from github-actions bot Oct 15, 2024

njhill added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 15, 2024

DarkLight1337 approved these changes Oct 15, 2024

simon-mo enabled auto-merge (squash) October 15, 2024 06:19

simon-mo disabled auto-merge October 15, 2024 06:19

simon-mo merged commit e9d517f into vllm-project:main Oct 15, 2024
60 of 66 checks passed

njhill deleted the fix-chat-stream-usage branch October 15, 2024 14:39

njhill mentioned this pull request Oct 17, 2024

[BugFix] Fix and simplify completion API usage streaming #9475

Merged

charlifu pushed a commit to charlifu/vllm that referenced this pull request Oct 23, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

7738411

Signed-off-by: charlifu <charlifu@amd.com>

vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Oct 23, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

a9dd2c4

Signed-off-by: Vinay Damodaran <vrdn@hey.com>

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

35714af

Signed-off-by: Alvant <alvasian@yandex.ru>

garg-amit pushed a commit to garg-amit/vllm that referenced this pull request Oct 28, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

3c37653

Signed-off-by: Amit Garg <mitgarg17495@gmail.com>

FerdinandZhong pushed a commit to FerdinandZhong/vllm that referenced this pull request Oct 29, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

db18e4c

Signed-off-by: qishuai <ferdinandzhong@gmail.com>

sumitd2 pushed a commit to sumitd2/vllm that referenced this pull request Nov 14, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

75f0a1e

Signed-off-by: Sumit Dubey <sumit.dubey2@ibm.com>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

20acb1a

mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024

[BugFix] Fix chat API continuous usage stats (vllm-project#9357)

a754e52

Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
