add cuda graph capture failure possible solution #3430

Merged
merged 1 commit into from
Feb 9, 2025

zhyncs (Member) commented Feb 9, 2025

Motivation

When EAGLE 2 speculative decoding is enabled, SGLang may fail to start up because CUDA graph capture fails. If this happens, consider reducing the CUDA graph max batch size (default: 160).
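As a rough illustration of the suggestion above, here is a minimal sketch of a helper that builds a launch command with a smaller CUDA graph max batch size. It assumes the server is started with `python -m sglang.launch_server` and that the setting is exposed as `--cuda-graph-max-bs`; check `--help` for your installed version. The helper names `build_launch_command` and `candidate_batch_sizes` are hypothetical, and the halving schedule is just one possible way to pick values to try, not something this PR prescribes.

```python
import sys

# Default CUDA graph max batch size, as stated in this PR.
DEFAULT_CUDA_GRAPH_MAX_BS = 160


def build_launch_command(model_path, cuda_graph_max_bs, extra_args=()):
    """Return an argv list that launches the server with a capped CUDA graph batch size.

    Assumption: the flag is spelled --cuda-graph-max-bs in your version.
    Pass your existing EAGLE / speculative decoding flags via extra_args.
    """
    if cuda_graph_max_bs < 1:
        raise ValueError("cuda_graph_max_bs must be a positive integer")
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--cuda-graph-max-bs", str(cuda_graph_max_bs),
        *extra_args,
    ]


def candidate_batch_sizes(start=DEFAULT_CUDA_GRAPH_MAX_BS, floor=1):
    """Yield progressively smaller values to try, halving each time."""
    bs = start
    while bs >= floor:
        yield bs
        bs //= 2
```

One way to use this: iterate over `candidate_batch_sizes()`, pass each value to `build_launch_command(...)`, and run the result with `subprocess.run` until startup no longer fails during CUDA graph capture. A smaller value means fewer batch sizes are captured as CUDA graphs, so expect a trade-off in performance for larger batches.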

Modifications

Checklist

  • Format your code according to the "Code Formatting with Pre-Commit" guide.
  • Add unit tests as outlined in the "Running Unit Tests" guide.
  • Update documentation / docstrings / example tutorials as needed, according to the "Writing Documentation" guide.
  • Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to the "Benchmark and Profiling" and "Accuracy Results" guides.
  • For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.

@zhyncs zhyncs merged commit bc72e5b into main Feb 9, 2025
2 of 18 checks passed
@zhyncs zhyncs deleted the zhyncs/doc branch February 9, 2025 14:57
zhyncs (Member, Author) commented Feb 9, 2025

fix #3395

chongli-uw pushed a commit to chongli-uw/sglang that referenced this pull request Feb 15, 2025