[Frontend][Feature] Add jamba tool parser #9154

tomeras91 · 2024-10-08T12:24:32Z

Add JambaToolParser to support tool calling for ai21labs/AI21-Jamba-1.5-Mini and ai21labs/AI21-Jamba-1.5-Large

github-actions · 2024-10-08T12:24:45Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

DarkLight1337 · 2024-10-09T09:37:41Z

@K-Mistele do you have time to review this?

K-Mistele · 2024-10-09T15:38:57Z

> @K-Mistele do you have time to review this?

Yes, happy to take a look!

mgoin

This seems reasonable to me but I will wait on Kyle to review

mgoin · 2024-10-09T20:32:47Z

vllm/entrypoints/openai/tool_parsers/jamba_tool_parser.py

+                # TODO: edit comment
+                # use a regex to find the tool call between the tags


Is this TODO done?

tomeras91 · 2024-10-10

Yes. Removed it now.
Thanks for catching this.

tomeras91 · 2024-10-14T08:56:33Z

> > @K-Mistele do you have time to review this?
>
> Yes, happy to take a look!

Ping? Is there anything I can do to help get this merged?

K-Mistele

Overall looks good! Just a couple of thoughts:

Tool use tests are already standardized across models in tests/tool_use. You can add a config for Jamba in vllm/tests/tool_use/utils.py so that the same tests that run on other models also run on it. This is preferred because it keeps tool testing standardized and makes sure all models pass the same tests, rather than defining custom tests for each model.
Please update the docs at docs/source/serving/openai_compatible_server.md with the relevant information for Jamba. You'll see that Llama 3.1, Hermes, Mistral, and InternLM already have entries there, so you should be able to use the same template.

tomeras91 · 2024-10-15T22:06:41Z

@K-Mistele

RE tests - that was my initial thought as well. The problem is that even Jamba-1.5-Mini is a fairly large model, with ~52B params, so it would take a long time to download every time the unit tests run. Moreover, even with quantization it needs an 80GB GPU, and AFAIU the tests currently run on L4 GPUs, so we'd have memory issues as well. The Jamba-tiny-dev model used for other Jamba tests is a pretrained model and doesn't support tool calling.
Other than the model size issues, the end-to-end behavior of tool calling is already covered by the tool calling tests of other models, and that end-to-end logic hasn't changed in this PR. The only logic this PR adds is the JambaToolParser logic. Conveniently (AKA good code design by you guys), it can be thoroughly tested without initializing a model, as I did in tests/tool_use/test_jamba_tool_parser.py. All it needs is the tokenizer and a raw string treated as the model output.
RE docs - sure! I'll update them soon.
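
To illustrate the point about testing a parser on a raw string, here is a minimal, self-contained sketch. It is not the JambaToolParser from this PR and does not use any vLLM API: the `<tool_calls>` / `</tool_calls>` delimiters, the `ExtractedToolCalls` dataclass, and the `extract_tool_calls` function are all assumptions made for illustration. It only shows the general idea of using a regex to find the tool call between the tags and checking the result without loading a model (the real tests also need the tokenizer, which this sketch skips).

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Assumed delimiters for illustration; the tags the real parser uses may differ.
TOOL_CALLS_RE = re.compile(r"<tool_calls>(.*?)</tool_calls>", re.DOTALL)


@dataclass
class ExtractedToolCalls:
    """Hypothetical result container, not a vLLM type."""
    tools_called: bool
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)
    content: Optional[str] = None


def extract_tool_calls(model_output: str) -> ExtractedToolCalls:
    """Pull a JSON list of tool calls out of a raw model output string."""
    match = TOOL_CALLS_RE.search(model_output)
    if match is None:
        # No tagged block: the whole output is ordinary content.
        return ExtractedToolCalls(False, [], model_output)
    try:
        calls = json.loads(match.group(1))
    except json.JSONDecodeError:
        # Malformed JSON: fall back to returning the raw text as content.
        return ExtractedToolCalls(False, [], model_output)
    if not isinstance(calls, list):
        # Valid JSON but not a list of calls: treat it as plain content.
        return ExtractedToolCalls(False, [], model_output)
    # Text before the opening tag is kept as content, unless it is empty.
    prefix = model_output[: match.start()]
    return ExtractedToolCalls(True, calls, prefix if prefix else None)


# A raw string standing in for the model output; no model is initialized.
raw = (
    'Let me check. <tool_calls>[{"name": "get_weather", '
    '"arguments": {"city": "Dallas"}}]</tool_calls>'
)
result = extract_tool_calls(raw)
print(result.tools_called, result.tool_calls, repr(result.content))
```

Because the function takes only a string, test cases can be plain string literals (no tool call, one tool call, several tool calls, text before the tags), which is what makes this kind of parser cheap to test on CI hardware.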

K-Mistele · 2024-10-16T18:36:14Z

> @K-Mistele
>
> RE tests - that was my initial thought as well. Problem is even Jamba-1.5-Mini is a pretty large model with ~52B params. It will take a long time to download every time the unittest is run. Moreover, even with quantization it needs an 80GB GPU, and AFAIU tests currently run on L4 GPUs.. so we'll have issues with memory as well. The Jamba-tiny-dev model used for other Jamba tests is a pretrained model and doesn't support tool calling.
> Other than the model size issues, the end2end behavior of tool calling is already tested in tool calling tests of other models. Also, that end2end logic hasn't changed in this PR. The only logic this PR adds is the JambaToolParser logic. And conveniently enough (AKA good code design by you guys), it can be thoroughly tested without initializing a model, as I did in tests/tool_use/test_jamba_tool_parser.py. All it needs is the tokenizer and a raw string treated as the model output.
>
> RE docs - sure! I'll update them soon

Sounds good! I didn't realize the model was so big. I will defer to @mgoin @DarkLight1337 then.

DarkLight1337

LGTM, thanks for your patience!

tomeras91 added 12 commits September 26, 2024 23:10

first working version of jamba tool parsing

cbd955a

lint and format

1a8c4e1

fix: We don't want to add content if it's an empty string

0da420d

add initial tests for jamba tool parser

310535c

reduce code duplication with use of parametrize

f5c9d09

fix model outputs to match jamba expected output

6b04e35

add tests for jamba tool parsing with streaming

c25cd51

Merge branch 'main' into add-jamba-tool-parser

d551be0

adjust JambaToolParser to changes in upstream

d31e688

Add adjust_request function to JambaToolParser since we need to set skip_special_tokens to False as done in Internlm2ToolParser

6a27eb3

update comments and remove unused code

bc16953

lint & format + adjust tests to new tool parser API

25d839d

tomeras91 added 2 commits October 9, 2024 09:24

dummy for build

16542bc

Revert "dummy for build"

2a25f10

This reverts commit 16542bc.

DarkLight1337 requested a review from mgoin October 9, 2024 09:38

DarkLight1337 added 2 commits October 9, 2024 15:02

Merge branch 'main' into add-jamba-tool-parser

a935865

Use vllm-project#9188 and improve validation

3c757c5

mgoin reviewed Oct 9, 2024

tomeras91 added 2 commits October 10, 2024 09:09

removed done TODO

0db1408

Merge branch 'add-jamba-tool-parser' of github.com:tomeras91/vllm into add-jamba-tool-parser

e5c2878

K-Mistele suggested changes Oct 15, 2024

tomeras91 and others added 2 commits October 17, 2024 23:09

Added Jamba tool calling to docs

20aeb6d

Apply vllm-project#9461

54efc40

DarkLight1337 approved these changes Oct 18, 2024

DarkLight1337 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Oct 18, 2024

DarkLight1337 enabled auto-merge (squash) October 18, 2024 02:49

DarkLight1337 added 2 commits October 18, 2024 10:58

Trigger build with fix typo

ae9a0b7

Fix missing option

d5fefe9

DarkLight1337 merged commit d2b1bf5 into vllm-project:main Oct 18, 2024
55 checks passed
