
Turn shareGPT data into a standard benchmark #45

Closed
zhuohan123 opened this issue Apr 22, 2023 · 0 comments · Fixed by #145
@zhuohan123 (Member) commented:
  1. Extract the lengths of the conversation rounds, and ideally make that data directly available from GitHub.
  2. The current L-shape evaluation, which binary-searches for throughput, is hard to run and does not scale. We should find an easier way to benchmark performance.
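For step 1, a minimal sketch of extracting per-round lengths from a ShareGPT-style dump. The field names (`conversations`, `from`, `value`) are an assumption based on common ShareGPT exports, not taken from the vLLM code:

```python
import json

# Hypothetical ShareGPT-style record: each item holds a "conversations"
# list of turns with "from" (role) and "value" (text). The schema is an
# assumption, not the actual vLLM benchmark format.
sample = json.loads("""
[
  {"conversations": [
    {"from": "human", "value": "Hello there"},
    {"from": "gpt", "value": "Hi! How can I help you today?"}
  ]}
]
""")

def round_lengths(dataset):
    """Return, per conversation, the character length of each turn."""
    return [
        [len(turn["value"]) for turn in conv["conversations"]]
        for conv in dataset
    ]

print(round_lengths(sample))  # [[11, 29]]
```

In a real benchmark these character counts would typically be replaced by tokenizer lengths, since request cost is driven by token counts rather than raw text size.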
fxmarty pushed a commit to fxmarty/vllm-public that referenced this issue Jun 12, 2024
yukavio pushed a commit to yukavio/vllm that referenced this issue Jul 3, 2024
…t#45)

Tested by checking the help message of the OpenAI API server:
```
python -m vllm.entrypoints.openai.api_server --help
```

Before:
```
  --sparsity {sparse_w16a16,None}, -s {sparse_w16a16,None}
                        Method used to compress sparse weights. If None, we first check the `sparsity_config`
                        attribute in the model config file. If that is None we assume the model weights are dense
```

After:
```
  --sparsity {None,sparse_w16a16,semi_structured_sparse_w16a16}, -s {None,sparse_w16a16,semi_structured_sparse_w16a16}
                        Method used to compress sparse weights. If None, we first check the `sparsity_config`
                        attribute in the model config file. If that is None we assume the model weights are dense
```
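The change above amounts to adding a new value to the argument's `choices`. A minimal argparse sketch (not the actual vLLM parser) showing that extending `choices` is enough for the new option to appear in `--help`:

```python
import argparse

# Minimal sketch, assuming a plain argparse setup: adding a value to
# `choices` makes argparse both accept it and list it in --help.
parser = argparse.ArgumentParser(prog="api_server")
parser.add_argument(
    "--sparsity", "-s",
    choices=[None, "sparse_w16a16", "semi_structured_sparse_w16a16"],
    default=None,
    help="Method used to compress sparse weights.",
)

args = parser.parse_args(["-s", "semi_structured_sparse_w16a16"])
print(args.sparsity)  # prints "semi_structured_sparse_w16a16"
```

Note that `None` in `choices` only affects the help text and the default; a literal `None` cannot be passed from the command line, since CLI values arrive as strings.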
@alixiaodi alixiaodi mentioned this issue Aug 2, 2024