
[Misc][Benchmarking] Enable benchmarks to create request from file #3530

Closed

Conversation

@ElizaWszola (Contributor) commented Mar 20, 2024

Enable benchmarks that read a text file and repeatedly send its contents as requests to the server. This way of benchmarking makes it possible to obtain a best-case / upper-bound scenario for request caching.

This PR was split from #3431 and used to create Sonnet dataset results.

To create requests from a text file, use the arguments --dataset path/to/dataset --request-from-text. This makes the benchmark repeatedly send the text file to the server, resulting in a best-case caching scenario. Example:

python benchmarks/benchmark_serving.py  --model huggyllama/llama-7b --dataset benchmarks/data/sonnet.txt --request-rate 2.5 --num-prompts 1000 --backend openai --endpoint /v1/completions --request-from-text
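For reference, the repeated-request mode described above could be sketched as follows. The function name, the (prompt, prompt_len) tuple shape, and the whitespace-based length estimate are illustrative assumptions, not the PR's actual code:

```python
from typing import List, Tuple


def sample_requests_from_text(path: str, num_prompts: int) -> List[Tuple[str, int]]:
    """Read one text file and return num_prompts identical (prompt, length) pairs.

    Sending the same prompt repeatedly makes every request after the first a
    potential cache hit, so the run measures the best case for request caching.
    """
    with open(path, encoding="utf-8") as f:
        prompt = f.read()
    # Rough token estimate; a real benchmark would use the model's tokenizer.
    prompt_len = len(prompt.split())
    return [(prompt, prompt_len)] * num_prompts
```

Because every entry is identical, the server sees the same prefix on each request, which is exactly the upper bound the PR description mentions.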

Comment on lines +182 to +183
per_token_latencies.append(
(outputs[i].latency - outputs[i].ttft) / output_len)
@ElizaWszola (Contributor Author)

Does this change make sense?

@ywang96 (Member) Mar 20, 2024

Yep! In #3277 what I did was

if output_len > 1:
    tpots.append((outputs[i].latency - outputs[i].ttft) / (output_len - 1))
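The `output_len - 1` divisor reflects that TTFT already covers the first generated token, so only the remaining tokens amortize the rest of the end-to-end latency. A standalone sketch of that calculation (the function name and sample values are made up for illustration):

```python
def time_per_output_token(latency: float, ttft: float, output_len: int) -> float:
    """TPOT: latency after the first token, averaged over the remaining tokens.

    TTFT accounts for the first token, leaving output_len - 1 tokens to
    share the rest of the request latency.
    """
    if output_len <= 1:
        # Matches the `if output_len > 1` guard: TPOT is undefined here.
        raise ValueError("TPOT is undefined for a single-token output")
    return (latency - ttft) / (output_len - 1)
```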

@ElizaWszola (Contributor Author)

I've also been thinking of making --dataset with --request-from-text point to a directory from which files would be randomly picked and sent as requests, but I don't know if there is a need for this kind of solution. The Sonnet test could then be reproduced by creating a directory with a single sonnet.txt file in it.

@ywang96 (Member) commented Mar 20, 2024

Hey @ElizaWszola! Thank you for the PR; we actually share a lot of similar ideas! Some changes in this PR (the more accurate TPOT calculation, the --dataset arg, etc.) have already been addressed in an earlier PR of mine, #3277 - could you please take a look at it so we can consolidate?

I also personally really like the idea of the data registry from #3431, but we can probably discuss it later when we really need it.

Comment on lines +123 to +127
system_message = {
"content": "You are a chatbot with the explicit goal of "
"helping the user as best as possible",
"role": "system",
}
Member

FYI, this actually won't work with Mixtral since it doesn't have a "system" role.
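One common workaround for models without a "system" role (not part of this PR; the helper name is hypothetical) is to fold the system prompt into the first user turn:

```python
def merge_system_into_user(messages):
    """Merge a leading system message into the first user message.

    Useful for models like Mixtral whose chat template rejects the
    "system" role. Returns a new message list; the input is not mutated.
    """
    if not messages or messages[0].get("role") != "system":
        return list(messages)
    system, rest = messages[0], messages[1:]
    if rest and rest[0].get("role") == "user":
        merged = dict(rest[0])
        merged["content"] = system["content"] + "\n\n" + merged["content"]
        return [merged] + rest[1:]
    # No user turn to merge into: demote the system message to a user turn.
    return [{"role": "user", "content": system["content"]}] + rest
```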

break

# Sample num_requests from the list.
assert dataset_args.num_samples <= len(dataset)
Member

In this case it won't be possible to run certain long-duration or high request-rate sessions if the dataset is too small, but we probably don't want to modify the dataset itself either.

Perhaps we should do sampling with replacement instead? (i.e. random.choices())
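Sampling with replacement would let num_samples exceed the dataset size, which the assert currently forbids. A minimal sketch of the difference (the function and parameter names are illustrative):

```python
import random


def sample_prompts(dataset, num_samples, with_replacement=True):
    """Sample prompts from a dataset.

    With replacement (random.choices), num_samples may exceed len(dataset),
    so small datasets can still drive long or high-rate sessions. Without
    replacement (random.sample), the size limit still applies.
    """
    if with_replacement:
        return random.choices(dataset, k=num_samples)
    if num_samples > len(dataset):
        raise ValueError("num_samples exceeds dataset size")
    return random.sample(dataset, num_samples)
```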

@ElizaWszola (Contributor Author)

@ywang96 Thanks for pointing this out! The benchmarking changes did travel from your PR through the ordered-dict evictor testing to this one, and I somehow missed it. I think it makes more sense to close this PR and work on yours - what do you think?

@ywang96 (Member) commented Mar 21, 2024

@ywang96 Thanks for pointing this out! The benchmarking changes did travel from your PR to the ordered dict evictor testing to this one and I somehow missed it. I think it makes more sense to close this PR and work on yours, what do you think?

Yep of course! Feel free to review my PR, thanks!

@ElizaWszola (Contributor Author) commented Mar 21, 2024

@ywang96 Alright, I'm closing this now. Sorry for the confusion on my side!
