This repository was archived by the owner on Oct 11, 2024. It is now read-only.

Add NM benchmarking scripts & utils #14

Merged
merged 75 commits into from
Feb 22, 2024

Conversation

varun-sundar-rabindranath

@varun-sundar-rabindranath varun-sundar-rabindranath commented Feb 15, 2024

Summary:
Add benchmarking scripts and utils.
Things to note:

  • All files are stored in neuralmagic folder.
  • neuralmagic/benchmarks/scripts/* : Actual benchmarking scripts that interact with vllm engine.
  • neuralmagic/benchmarks/configs/* : JSON config files that define what benchmark commands to run.
  • neuralmagic/benchmarks/run_*.py : Scripts that consume some config file and run the benchmark scripts.
  • neuralmagic/tools : Add tools
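
To illustrate the split between configs and runners described above, here is a minimal sketch of how a `run_*.py` script might turn a JSON config into benchmark invocations. The config keys (`benchmarks`, `script`, `args`) and the `load_commands` helper are assumptions for illustration, not the actual schema used in this PR:

```python
import json
import sys
from pathlib import Path

def load_commands(config_path: str) -> list:
    """Parse a JSON benchmark config into argv lists, one per benchmark.

    Assumed config shape (hypothetical, not the PR's real schema):
    {"benchmarks": [{"script": "...", "args": ["--flag", "val"]}, ...]}
    """
    config = json.loads(Path(config_path).read_text())
    return [
        [sys.executable, bench["script"], *bench.get("args", [])]
        for bench in config.get("benchmarks", [])
    ]
```

A runner would then loop over `load_commands(path)` and execute each argv list with `subprocess.run(cmd, check=True)`.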

Testing:
Local testing

@varun-sundar-rabindranath varun-sundar-rabindranath marked this pull request as ready for review February 16, 2024 21:57
@robertgshaw2-redhat
Collaborator

robertgshaw2-redhat commented Feb 19, 2024

Thanks Varun. In general this is looking good. I'm going to leave some more comments in a bit.

The one thing that is a bit tricky here is that the dataset processing is a bit too tied into the server benchmarking, which makes it hard to support swapping the benchmarked datasets in and out (which we will do over time). For example, we will want a different request pattern for prefix-caching performance benchmarking than for general benchmarking.

I'm going to put up a PR with an idea for how to make this a bit more pluggable.

@varun-sundar-rabindranath
Author

Thanks @robertgshaw2-neuralmagic

The one thing that is a bit tricky here is that the dataset processing is a bit too tied into the server benchmarking, which makes it hard to support swapping the benchmarked datasets in and out (which we will do over time).

I made a recent refactor that moves the dataset-related code into neuralmagic-vllm/neuralmagic/benchmarks/scripts/common.py, so it should be a little more organized now. But I agree with the sentiment that there are some kinks with the dataset handling. Happy to talk about it.

@robertgshaw2-redhat
Collaborator

@varun-sundar-rabindranath

I completely hacked the relative import system to do a simple proof of concept of how the dataset_registry could work. https://github.com/neuralmagic/neuralmagic-vllm/pull/30/files#diff-dca4e725ece41a665c0423924d56c905ce4c00188d79bc5dbeacb222c0ae6c6a

The idea was to have a programmatic way to add new datasets. Thus, the get_sharegpt and get_ultrachat functions are responsible for both downloading and preprocessing the data.

  • This structure removes the need for a dataset_download_cmds functionality, as downloads can now be handled programmatically within the dataset registry.
  • This structure separates the preprocessing functionality from the benchmarking scripts. Since each dataset will have a different format, this is crucial.

This should make it easy to add new datasets over time.
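
As a rough sketch of the registry idea, a dict mapping dataset names to loader functions plus a registration decorator would be enough. The names `DATASET_REGISTRY`, `register_dataset`, and `load_dataset` are hypothetical here; only `get_sharegpt`/`get_ultrachat` come from the discussion above, and the record shape is a placeholder:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: dataset name -> loader that downloads and
# preprocesses the data into benchmark-ready records.
DATASET_REGISTRY: Dict[str, Callable[[], List[Tuple[str, int]]]] = {}

def register_dataset(name: str):
    """Decorator that registers a loader function under `name`."""
    def wrap(fn: Callable[[], List[Tuple[str, int]]]):
        DATASET_REGISTRY[name] = fn
        return fn
    return wrap

@register_dataset("sharegpt")
def get_sharegpt() -> List[Tuple[str, int]]:
    # A real loader would download and preprocess ShareGPT here;
    # this stub just returns one (prompt, output_len) record.
    return [("example prompt", 128)]

def load_dataset(name: str) -> List[Tuple[str, int]]:
    """Look up and invoke the registered loader for `name`."""
    return DATASET_REGISTRY[name]()
```

With this shape, adding a new dataset is just defining one decorated function; the benchmarking scripts never need to know about per-dataset formats.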

@@ -0,0 +1,178 @@
"""
Common functions used in all benchmarking scripts
Member


i think we want to point out these are for our benchmarking scripts.

Member


looks like we are going down this rabbit hole again.

Member

@andy-neuma andy-neuma left a comment


thanks for the "meetup". looks good.

@varun-sundar-rabindranath varun-sundar-rabindranath changed the title Varun/nm benchmarks Add NM Benchmarking Feb 22, 2024
@varun-sundar-rabindranath varun-sundar-rabindranath changed the title Add NM Benchmarking Add NM benchmarking scripts & utils Feb 22, 2024
@varun-sundar-rabindranath varun-sundar-rabindranath merged commit 77928e0 into main Feb 22, 2024
2 checks passed
@varun-sundar-rabindranath varun-sundar-rabindranath deleted the varun/nm-benchmarks branch February 22, 2024 22:21