Add LoRA support for Gemma #3050

Merged
WoosukKwon merged 13 commits into main from gemma-lora on Feb 28, 2024
Conversation

WoosukKwon
Collaborator

Closes #3044

@WoosukKwon requested a review from Yard1 on February 27, 2024 07:00
@Yard1
Collaborator

Yard1 commented Feb 27, 2024

Looks good, can we extend the test?

@WoosukKwon
Collaborator Author

@Yard1 Added. PTAL!
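(For context, a rough sketch of the kind of end-to-end LoRA check being discussed here, using vLLM's public API; the adapter path, adapter name, and prompt below are placeholders, not the ones used in the PR's actual test:)

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Hypothetical adapter path; the PR's test supplies its own LoRA fixture.
llm = LLM(model="google/gemma-7b", enable_lora=True)
outputs = llm.generate(
    ["Quote: Imagination is more"],
    SamplingParams(temperature=0.0, max_tokens=32),
    lora_request=LoRARequest("gemma-test-adapter", 1, "/path/to/gemma-lora"),
)
print(outputs[0].outputs[0].text)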

@WoosukKwon
Collaborator Author

@Yard1 The CI for this PR keeps failing in test_llama_lora_warmup, even though the PR seems unrelated to that test. Do you have any idea about this? I tried to debug it but didn't succeed.

@Yard1
Collaborator

Yard1 commented Feb 28, 2024

Hmm, running the entire thing as a new process should help with cleanup. Or maybe try --forked?
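(For reference, a hedged sketch of what per-test process isolation with the pytest-forked plugin looks like; the test body below is an illustrative placeholder, not the actual warmup test:)

# Requires the pytest-forked plugin. `pytest --forked` runs every test in its
# own forked subprocess; individual tests can also opt in via the marker, so
# GPU/global state is torn down when the child process exits.
import pytest

@pytest.mark.forked
def test_llama_lora_warmup():
    # Placeholder standing in for the real warmup test body.
    assert True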

@WoosukKwon
Collaborator Author

@Yard1 Adding --forked indeed resolved the issue, but it increased the LoRA test time from 13 minutes to 37 minutes.

@WoosukKwon
Collaborator Author

Let's merge the PR and optimize the testing time later. I think this is OK since the LoRA test is not required for most PRs, and the kernel test (which also uses --forked) takes a similar amount of time.

@WoosukKwon enabled auto-merge (squash) February 28, 2024 21:03
@WoosukKwon disabled auto-merge February 28, 2024 21:03
@WoosukKwon merged commit 929b4f2 into main Feb 28, 2024
20 of 22 checks passed
@WoosukKwon deleted the gemma-lora branch February 28, 2024 21:03
@zhaotyer
Contributor

Gemma "vocab_size" is 256000, Will the following restrictions have any impact? Should they be removed?
vllm/lora/layers.py

# Keep this in sync with csrc/punica/bgmv/bgmv_config.h
if 32000 < self.base_layer.vocab_size > 33024:
    raise ValueError(
        "When using LoRA, vocab size must be 32000 >= vocab_size <= 33024"
    )
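(Side note on the quoted check: the chained comparison 32000 < vocab_size > 33024 evaluates as "32000 < vocab_size and vocab_size > 33024", while the error message reads like a range check. A sketch of what the condition presumably intends, written here as an assumption rather than as the code that was merged:)

# Presumed intent (illustrative only): restrict LoRA to vocab sizes that the
# punica BGMV kernels in csrc/punica/bgmv/bgmv_config.h are built for.
if not (32000 <= self.base_layer.vocab_size <= 33024):
    raise ValueError(
        "When using LoRA, vocab size must satisfy "
        "32000 <= vocab_size <= 33024")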

@Yard1
Collaborator

Yard1 commented Feb 29, 2024

We are not supporting lm_head deltas for gemma

@zhaotyer
Contributor

zhaotyer commented Mar 4, 2024

> We are not supporting lm_head deltas for gemma

Thank you for your reply @Yard1

new_module = replace_submodule(
    self.model, module_name,
    from_layer(module, self.lora_slots, self.lora_config,
               self.model.config))
# (yard1): TODO make this more robust
if "lm_head" in module_name:
    sampler_module = self.model.get_submodule("sampler")
    new_module = replace_submodule(
        self.model, "sampler",
        from_layer_sampler(sampler_module, module, self.lora_slots,
                           self.lora_config, self.model.config))

The vocab_size check is only performed when lm_head is also handled by LoRA, but please take a look at this question: #3000
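(To connect the two points above: the sampler branch, and with it the vocab-size check, is only reached when lm_head is among a model's LoRA target modules. A rough sketch of that relationship; the exact supported_lora_modules list for Gemma is an assumption here, not quoted from the PR:)

# Hypothetical allow-list on the Gemma model class. Because lm_head (and
# embed_tokens) are left out, the "lm_head" branch above, and the vocab-size
# check in vllm/lora/layers.py, is never reached for Gemma.
supported_lora_modules = ["qkv_proj", "o_proj", "gate_up_proj", "down_proj"]
assert "lm_head" not in supported_lora_modules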
