[Misc] Standardize RoPE handling for Qwen2-VL #9250

DarkLight1337 · 2024-10-10T14:38:31Z

huggingface/transformers#33401 has been fixed in Transformers v4.45.2, but the Transformers devs have clarified that M-RoPE is intended to be configured as rope_type="default". To avoid future compatibility issues, this PR updates the RoPE code in vLLM to follow their specification.
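
The convention described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `normalize_rope_scaling` and `uses_mrope` are hypothetical names, and the `su` to `longrope` rename and the `mrope_section` key are assumed from the Transformers config format rather than taken from this thread.

```python
from typing import Any, Dict


def normalize_rope_scaling(rope_scaling: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy that uses the newer `rope_type` key."""
    patched = dict(rope_scaling)  # do not mutate the caller's dict

    # Legacy configs store the RoPE variant under "type"; newer ones use "rope_type".
    if "rope_type" not in patched:
        if "type" not in patched:
            raise ValueError("rope_scaling needs a 'rope_type' (or legacy 'type') key")
        patched["rope_type"] = patched["type"]

    if patched["rope_type"] == "su":
        # Assumption: "su" is the legacy name for "longrope".
        patched["rope_type"] = "longrope"
    elif patched["rope_type"] == "mrope":
        # M-RoPE is expressed as the default type plus an mrope_section entry.
        if "mrope_section" not in patched:
            raise ValueError("legacy 'mrope' config is missing 'mrope_section'")
        patched["rope_type"] = "default"

    return patched


def uses_mrope(rope_scaling: Dict[str, Any]) -> bool:
    """Hypothetical check: detect M-RoPE by its section sizes, not by a type string."""
    return "mrope_section" in rope_scaling


legacy = {"type": "mrope", "mrope_section": [16, 24, 24]}
patched = normalize_rope_scaling(legacy)
print(patched["rope_type"], uses_mrope(patched))
```

With this approach, code that previously compared the type string against "mrope" would instead look for the section sizes, so both old and new checkpoints take the same path after normalization.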

github-actions · 2024-10-10T14:38:45Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add the ready label to the PR
- Enable auto-merge

🚀

DarkLight1337 · 2024-10-10T14:45:30Z

vllm/model_executor/layers/rotary_embedding.py

-        if scaling_type not in {"su", "longrope"}:
-            scaling_factor = rope_scaling.get("factor", 1.0)


It is not clear which RoPE implementations need this, so I've moved this code into the individual if blocks.
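
As a rough sketch of what "moving the lookup into the individual if blocks" looks like, the example below reads each parameter only in the branch that uses it. The function name `describe_rope`, the returned dicts, and the particular set of branches are illustrative assumptions, not the actual contents of rotary_embedding.py.

```python
from typing import Any, Dict, Optional


def describe_rope(rope_scaling: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical dispatcher: each branch reads only the keys it needs."""
    if rope_scaling is None:
        return {"kind": "default"}

    scaling_type = rope_scaling["rope_type"]

    if scaling_type == "default":
        # No "factor" lookup here; M-RoPE only needs its section sizes.
        return {"kind": "default",
                "mrope_section": rope_scaling.get("mrope_section")}

    if scaling_type == "linear":
        # The factor is fetched inside the branch that actually uses it.
        return {"kind": "linear", "scaling_factor": rope_scaling["factor"]}

    if scaling_type == "longrope":
        # Assumption: longrope is driven by per-dimension factor lists
        # rather than a single scalar "factor".
        return {"kind": "longrope",
                "short_factor": rope_scaling["short_factor"],
                "long_factor": rope_scaling["long_factor"]}

    raise ValueError(f"Unknown RoPE scaling type {scaling_type!r}")


print(describe_rope({"rope_type": "linear", "factor": 2.0}))
```

One consequence of this layout is that a missing key fails in the branch that requires it, instead of being silently defaulted to 1.0 for every scaling type.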

DarkLight1337 · 2024-10-11T15:50:24Z

@ywang96 PTAL when you have time.

ZhangYaoFu · 2024-10-12T07:31:38Z

Can mrope be completed using specialized CUDA operators? According to the profile results, this part is quite time-consuming.

DarkLight1337 · 2024-10-12T09:12:13Z

> Can mrope be completed using specialized CUDA operators? According to the profile results, this part is quite time-consuming.

Can you open a new issue for this? Thanks

ywang96

Left two comments - PTAL!

vllm/config.py

Isotr0py

After testing on Qwen2-VL and Phi-3-medium, I'm fine with these changes, since su and mrope are both handled correctly by the patching.

Update LoRA handling for Qwen2-VL to conform to Transformers

71e5cd0

DarkLight1337 requested review from WoosukKwon and ywang96 October 10, 2024 14:38

Add note

992d4cd

DarkLight1337 force-pushed the qwen2vl-lora branch from 33a037e to 992d4cd Compare October 10, 2024 14:40

DarkLight1337 changed the title ~~[Misc] Update LoRA handling for Qwen2-VL to conform to Transformers~~ [Misc] Update RoPE handling for Qwen2-VL to conform to Transformers Oct 10, 2024

DarkLight1337 changed the title ~~[Misc] Update RoPE handling for Qwen2-VL to conform to Transformers~~ [Misc] Standardize RoPE handling for Qwen2-VL Oct 10, 2024

Move backwards compatibility

530d8a0

DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 10, 2024

Tests may use outdated rope scalings, so we patch them as well

1e9605b

DarkLight1337 force-pushed the qwen2vl-lora branch from 5c91da3 to 1e9605b Compare October 11, 2024 07:08

ywang96 reviewed Oct 13, 2024

DarkLight1337 added 3 commits October 13, 2024 05:24

Add note

2bd105a

Update internal usage of legacy "rope_scaling.type"

a6bbb7d

Update internal usage of rope_scaling.type == "mrope"

e9fad27

DarkLight1337 requested a review from tlrmchlsmth as a code owner October 13, 2024 05:39

DarkLight1337 added 7 commits October 13, 2024 05:40

Remove redundant patch

6672eb1

Fix

1ea4ad4

Fix failure in internvl test

9a2a88e

Merge branch 'main' into qwen2vl-lora

cf68bd7

Fix rope scaling key

d08a674

Update patch

27a7917

Merge branch 'main' into qwen2vl-lora

a563aba

Isotr0py approved these changes Oct 16, 2024

Isotr0py merged commit 7e7eae3 into main Oct 16, 2024
76 checks passed

DarkLight1337 deleted the qwen2vl-lora branch October 16, 2024 05:57

DarkLight1337 mentioned this pull request Oct 22, 2024

[Bug]: deploy Phi-3-mini-128k-instruct AssertionError #4784

Closed
