[Question] MBU in automated CI? #237

Open
cadedaniel opened this issue May 10, 2024 · 2 comments

Comments

cadedaniel commented May 10, 2024

Hi folks, thanks for the great work.

With #135 merged, vLLM could benefit from a torch.compile backend, given the compiler-native integration with PagedAttention kernels.

Is there an easy way to see the latest/nightly MBU (model bandwidth utilization) for torch.compile on, say, H100 with Llama3 70B?

I'm also interested in cold-start compile time.

cc @msaroufim
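
For readers unfamiliar with the metric in the question: MBU is usually taken as achieved memory bandwidth divided by peak memory bandwidth, where the achieved figure during decoding is roughly the bytes read per token times tokens per second. The sketch below is only an illustration of that arithmetic. The function name, the tensor-parallel degree, the throughput value, and the 3.35 TB/s peak figure are all assumptions for the example, not measurements or anything taken from this repo's CI.

```python
def model_bandwidth_utilization(param_bytes, tokens_per_second,
                                peak_bandwidth_bytes_per_s, kv_cache_bytes=0.0):
    """Return the fraction of peak memory bandwidth used while decoding.

    Assumes batch size 1, so each generated token reads all weights
    (plus the KV cache, if supplied) from memory once.
    """
    if peak_bandwidth_bytes_per_s <= 0:
        raise ValueError("peak bandwidth must be positive")
    achieved = (param_bytes + kv_cache_bytes) * tokens_per_second
    return achieved / peak_bandwidth_bytes_per_s


# Illustrative inputs only (hypothetical, not benchmark results).
params = 70e9                 # parameter count of a 70B model
bytes_per_param = 2           # bf16 weights
num_gpus = 8                  # assumed tensor-parallel degree
per_gpu_bytes = params * bytes_per_param / num_gpus
peak = 3.35e12                # assumed per-GPU peak bandwidth in bytes/s
tokens_per_second = 100.0     # placeholder throughput

mbu = model_bandwidth_utilization(per_gpu_bytes, tokens_per_second, peak)
print(f"MBU = {mbu:.1%}")
```

A nightly job would replace the placeholder throughput with the measured tokens per second and the peak figure with the value from the hardware's datasheet.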

@supriyar
Contributor

@anijain2305 do we have any benchmark numbers for the cold start compile time?
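
Until official numbers are available, one rough way to measure cold start locally is to time the first call separately from later calls, since torch.compile compiles lazily on first invocation. This is a minimal sketch with a hypothetical helper name (`time_cold_and_warm`) and a stand-in callable that simulates one-time setup cost. With a real model you would pass the compiled module instead, and on GPU you would need to synchronize the device before reading the clock.

```python
import time


def time_cold_and_warm(fn, *args, warm_iters=5):
    """Return (first-call seconds, mean seconds over later calls)."""
    start = time.perf_counter()
    fn(*args)                      # first call pays any one-time compile cost
    cold = time.perf_counter() - start

    warm_times = []
    for _ in range(warm_iters):
        start = time.perf_counter()
        fn(*args)
        warm_times.append(time.perf_counter() - start)
    return cold, sum(warm_times) / len(warm_times)


class LazySetup:
    """Stand-in for a lazily compiled function (hypothetical, for demo only)."""

    def __init__(self):
        self.ready = False

    def __call__(self, x):
        if not self.ready:
            time.sleep(0.05)       # simulated one-time compile
            self.ready = True
        return x * 2


cold_s, warm_s = time_cold_and_warm(LazySetup(), 3)
print(f"cold: {cold_s:.4f}s, warm: {warm_s:.6f}s")
```

The difference between the two numbers approximates the one-time cost. It does not separate out any caching done across processes, so a true cold start should be measured in a fresh process.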

@msaroufim
Member

Related: pytorch/pytorch#125958

yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* clean up gguf loading.  Move model loading to meta.

* remove cpu

* Fix CI and validation scripts (pytorch#154)

* missing device (pytorch#232)

* Use generator args to group all arguments to generator (pytorch#231)

* prompt

* chat_mode, num_samples

* Move more generator args to use dataclass (pytorch#233)

* prompt

* chat_mode, num_samples

* move more args

* more gen args

* update

* args

* undo some changes

* typos

* Minor lint fixes (pytorch#236)

* remove redundancy & remove int4 linear test from ET tests (pytorch#237)

* remove redundancy

* no int4 linear on ET

* small changes

---------

Co-authored-by: Guang Yang <42389959+guangy10@users.noreply.github.com>
Co-authored-by: Michael Gschwind <61328285+mikekgfb@users.noreply.github.com>
Co-authored-by: Mergen Nachin <mnachin@meta.com>