[Model] Pipeline parallel support for Mixtral #6403

Closed
binxuan wants to merge 4 commits

@binxuan binxuan commented Jul 13, 2024

Add pipeline parallel (PP) support for Mixtral. This extends a previously merged PR.


Below are the tests we have performed:

Environment: two AWS P4d instances, each with 8 A100 GPUs.
Model checkpoint: Mixtral-8x22B-Instruct-v0.1 running with TP=8 and PP=2
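
To make the TP=8, PP=2 setup concrete, here is a minimal sketch of how the parallel sizes map onto the 16 GPUs and how decoder layers could be split across pipeline stages. The helper `partition_layers` is hypothetical and uses a simple even split; it is not taken from this PR and may differ from vLLM's actual partitioning logic. The layer count of 56 is an assumption based on the public Mixtral-8x22B configuration.

```python
def partition_layers(num_layers: int, pp_size: int) -> list[tuple[int, int]]:
    """Split layers into pp_size contiguous [start, end) ranges.

    Any remainder is spread over the first stages (illustrative only).
    """
    base, extra = divmod(num_layers, pp_size)
    ranges = []
    start = 0
    for rank in range(pp_size):
        end = start + base + (1 if rank < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


TP_SIZE = 8        # tensor parallel size, from the PR description
PP_SIZE = 2        # pipeline parallel size, from the PR description
NUM_LAYERS = 56    # assumption: decoder layers in Mixtral-8x22B

# Total number of GPU workers: one per (pipeline stage, tensor shard) pair.
world_size = TP_SIZE * PP_SIZE

# Layer range owned by each pipeline stage.
stage_ranges = partition_layers(NUM_LAYERS, PP_SIZE)

for rank, (start, end) in enumerate(stage_ranges):
    print(f"PP rank {rank}: layers [{start}, {end}) sharded over {TP_SIZE} GPUs")
```

Under these assumptions, each of the two nodes hosts one pipeline stage, and the 8 GPUs within a node share that stage's layers through tensor parallelism.
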
Case 1:
Input: {"role": "user", "content": "Who won the world series in 2020?"}
Output: {"role":"assistant","content":" The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games. It was their first championship since 1988. The series was played at Globe Life Field in Arlington, Texas, due to the COVID-19 pandemic.","tool_calls":[]}

Case 2:
Input: {"role": "user", "content": "Who are you?"}
Output: {"role":"assistant","content":" I am an artificial intelligence assistant, designed to help answer questions, provide information, and assist with various tasks. I don't have personal experiences, emotions, or consciousness, but I can process and generate text based on the data I've been trained on.","tool_calls":[]}
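
For anyone wanting to repeat these checks, the sketch below builds a chat request for the first test case. It assumes the model is served through an OpenAI-compatible `/v1/chat/completions` endpoint, which the PR description does not state explicitly. `BASE_URL` is a hypothetical address, and the model identifier is assumed to be the Hugging Face name of the checkpoint. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical server address; change it to wherever the server listens.
BASE_URL = "http://localhost:8000"
# Assumed model identifier for the checkpoint named in the PR description.
MODEL = "mistralai/Mixtral-8x22B-Instruct-v0.1"


def build_chat_request(content: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with one user message."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Who won the world series in 2020?")

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"])
```
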


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only trigger the fastcheck CI, which consists of a small, essential subset of tests to catch errors quickly. You can run extra individual tests on top of it by unblocking test steps in the Buildkite run.

A full CI run is still required to merge this PR, so once the PR is ready to go, please make sure to run it. If you need all test signals between PR commits, you can trigger full CI as well.

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@youkaichao
Member

can you fix the format, and also use the latest change of #6406 ?

@binxuan
Author

binxuan commented Jul 15, 2024

> can you fix the format, and also use the latest change of #6406 ?

Yeah sure, will use the format proposed in this PR.

@comaniac
Collaborator

@binxuan I also need Mixtral PP support. If you don't have bandwidth, I could take over and file another PR with you as a co-author. Plz let me know. Thanks

@binxuan
Author

binxuan commented Jul 17, 2024

> @binxuan I also need Mixtral PP support. If you don't have bandwidth, I could take over and file another PR with you as a co-author. Plz let me know. Thanks

Sure, feel free to collaborate

@DarkLight1337
Member

Closing as superseded by #6516

5 participants