Woo, thank you @zhyncs.
I just tried the new image lmsysorg/sglang:v0.4.3.post2-cu125.
The performance seems similar to 0.4.2 (on 16 x H20).
When running-req = 1, the gen throughput (token/s) is no higher than before.
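For context, the throughput figure discussed above is just decoded tokens divided by decode wall time per request; a minimal sketch (the helper name is my own, not part of sglang):

```python
def gen_throughput(num_generated_tokens: int, decode_seconds: float) -> float:
    """Generation throughput in token/s: tokens decoded per second of
    decode-phase wall time for one request (illustrative helper)."""
    if decode_seconds <= 0:
        raise ValueError("decode_seconds must be positive")
    return num_generated_tokens / decode_seconds

# e.g. 512 tokens decoded in 4.0 s -> 128.0 token/s
rate = gen_throughput(512, 4.0)
```

With a single running request this number is bounded by per-step decode latency, which is why speculative decoding (EAGLE 2 / MTP below) is the main lever for improving it.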
Triton Backend
@ispobock @pankajroark
refactor triton backend 1, 2
support custom mask
support EAGLE 2
compatible with CUDA Graph
support nextn I (single MTP head)
support nextn II (multi MTP heads) (WIP @pankajroark)
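The "custom mask" item above is what lets a backend attend over tree-structured draft tokens (as EAGLE 2 produces) instead of a plain causal sequence. A minimal sketch of building such a mask from parent pointers (my own illustration of the idea, not the Triton kernel's actual layout):

```python
def build_tree_mask(parents: list[int]) -> list[list[bool]]:
    """Boolean attention mask for a draft-token tree.

    parents[i] is the index of draft token i's parent, or -1 for a root.
    Token i may attend to token j iff j lies on i's path back to the root
    (including i itself), so siblings in the tree never see each other.
    """
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:  # walk up to the root, enabling each ancestor
            mask[i][j] = True
            j = parents[j]
    return mask

# One root (token 0) with two children: each child attends to itself and
# the root, but not to its sibling.
m = build_tree_mask([-1, 0, 0])
```

CUDA Graph compatibility then mostly reduces to keeping this mask (and the tree shape) in fixed-size buffers so the captured graph can be replayed.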
FlashInfer Backend
@zhyncs @yzh119
compatible with MLA disabled
support FlashInfer nightly MLA ragged prefill and CUDA Core MLA decoding
support FlashInfer v0.2.0.post3 MLA ragged, paged prefill and decoding (@zhyncs @yzh119 )
the nextn parts can be shared with the Triton Backend
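"Paged" prefill and decoding above refer to FlashInfer's page-table KV-cache layout, where a sequence's logical pages map to scattered physical pages. A rough sketch of the indexing idea (page size and helper name are my own, not FlashInfer's API):

```python
def token_to_page_slot(position: int,
                       page_table: list[int],
                       page_size: int = 16) -> tuple[int, int]:
    """Map a token position within a sequence to (physical_page, slot).

    page_table[k] gives the physical page holding the sequence's k-th
    logical page; each page stores page_size tokens' KV entries.
    """
    logical_page, slot = divmod(position, page_size)
    return page_table[logical_page], slot

# Sequence whose logical pages 0, 1, 2 live in physical pages 7, 3, 9:
page_table = [7, 3, 9]
# Token 20 falls in logical page 1, slot 4 -> physical page 3.
loc = token_to_page_slot(20, page_table)
```

"Ragged" prefill, by contrast, packs variable-length prompts contiguously without the page indirection.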
EAGLE 2
@zhyncs @Ying1123
implement sampling kernel in sgl-kernel (drop cutex): kernel part, python part
bunch of fixes: non-greedy fix, disable CUDA graph fix 1, fix 2, cleanup 1, cleanup 2, fix CUDA graph capture failure, fix 2, reduce one draft forward
compatible with radix cache and chunked prefill (WIP @Ying1123 )
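At its core, EAGLE-style speculation accepts draft tokens only where the target model agrees, then takes one correction token from the target. A greedy-verification sketch (deliberately simplified: real EAGLE 2 verifies a token tree and can use probabilistic acceptance for non-greedy sampling, which the fixes above address):

```python
def verify_greedy(draft_tokens: list[int], target_tokens: list[int]) -> list[int]:
    """Accept the longest draft prefix matching the target model's greedy
    choices, then append the target's token at the first disagreement
    (the "bonus" token), so every verify step emits at least one token.

    target_tokens[i] is the target model's greedy prediction at position i
    given the prompt plus draft_tokens[:i]; len(target_tokens) must be
    len(draft_tokens) + 1.
    """
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break  # first mismatch: stop accepting draft tokens
        accepted.append(d)
    # Correction/bonus token from the target model at the mismatch point
    # (or one extra token when the whole draft was accepted).
    accepted.append(target_tokens[len(accepted)])
    return accepted

# Draft [5, 9, 2] vs target greedy [5, 9, 7, 1]: accept 5 and 9, take 7.
out = verify_greedy([5, 9, 2], [5, 9, 7, 1])
```

The radix-cache/chunked-prefill compatibility work above is about making sure rejected draft tokens are rolled back from the shared prefix cache correctly.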