KeyError: 'base_model.model.model.layers.0.self_attn.qkv_proj.lora_A.weight' #1625
Comments
Also facing this problem!

File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 231, in from_engine_args
    engine = cls(*engine_configs,
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self._init_workers(distributed_init_method)
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 142, in _init_workers
    self._run_workers(
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 700, in _run_workers
    output = executor(*args, **kwargs)
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/worker/worker.py", line 70, in init_model
    self.model = get_model(self.model_config)
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 98, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
File "/usr4/ec523/brucejia/.local/lib/python3.8/site-packages/vllm/model_executor/models/llama.py", line 322, in load_weights
    param = params_dict[name.replace(weight_name, param_name)]
KeyError: 'model.layers.0.self_attn.qkv_proj.base_layer.weight' |
@SuperBruceJia, did you find any other method to tackle this error? |
Yes! First install the patched branch:

git clone --branch support_peft https://github.com/SuperBruceJia/vllm.git
cd vllm
pip install -e . --user

Then, in Python (model_name, num_gpus, save_dir_llm, and save_dir are placeholders for your model name, GPU count, download directory, and LoRA adapter path):

import gc

import torch
from vllm import LLM, SamplingParams
from vllm.model_executor.adapters import lora
from vllm.model_executor.parallel_utils.parallel_state import destroy_model_parallel

# Load the base model and attach the LoRA adapter
save_dir = "YOUR_PATH"  # path to the saved LoRA adapter
llm = LLM(model=model_name, download_dir=save_dir_llm, tensor_parallel_size=num_gpus, gpu_memory_utilization=0.70)
lora.LoRAModel.from_pretrained(llm.llm_engine.workers[0].model, save_dir)

# Delete the llm object and free the GPU memory
destroy_model_parallel()
del llm
gc.collect()
torch.cuda.empty_cache()
torch.distributed.destroy_process_group()
print("Successfully deleted the llm pipeline and freed the GPU memory!")

If you use some models that need Hugging Face login:
|
Generally speaking, the recommended approach is the solution mentioned here: load the pre-trained model and merge the LoRA weights into it before serving. |
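For illustration, a sketch of that merge step using the peft library (the model and adapter paths are hypothetical placeholders, not taken from this thread):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "path/or/name_of_base_model"   # hypothetical base model
adapter_dir = "path/to/lora_adapter"     # hypothetical LoRA adapter checkpoint
merged_dir = "path/to/merged_model"      # where the merged weights will be saved

# Load the base model and attach the LoRA adapter
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_dir)

# Fold the LoRA weights into the base weights and save a plain checkpoint
merged = model.merge_and_unload()
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(merged_dir)

The merged_dir checkpoint can then be passed to vLLM as LLM(model=merged_dir), so no adapter-specific weight names reach vLLM's weight loader.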
While setting up the dependencies, the install fails with:
note: This error originates from a subprocess, and is likely not a problem with pip.

@SuperBruceJia, I want to use vLLM for inference on a fine-tuned CodeLlama base model.
Could you tell me how to use vLLM to run inference on CodeLlama? Your proposed solution is giving me the above error. Any help is highly appreciated. |
"RuntimeError: Please note that the error is triggered by the CUDA version. You may get some help from this issue and solution. |
Please fix the mismatch between your CUDA version and PyTorch version. You may need to uninstall and re-install PyTorch. |
@SuperBruceJia, I fixed the CUDA version & PyTorch version issue; the problem was that my PyTorch install was raising NameError: name is not defined. I am already logged in to my Hugging Face account. I then downloaded all the model files onto my VM and tried to use the folder where all the files are present, but I still get the error. How do I fix this? Say I want to run inference on the model |
Hello everyone. Can I get help? I'm running inference on a custom fine-tuned Phi-2 model, using Modal and vLLM, and I'm getting: base_model.model.model.layers.0.mlp.fc2.lora_A.weight @Soumendraprasad @SuperBruceJia |
The supported LoRA targets are limited to:

target_modules=[
    "q_proj",
    "k_proj",
    "v_proj",
] |
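For example, a fine-tuning configuration consistent with that constraint might look like the following (a sketch using the peft library; the rank, alpha, and dropout values are illustrative, not from this thread):

from peft import LoraConfig

# Restrict LoRA to the attention projections that the patched vLLM branch can map
lora_config = LoraConfig(
    r=16,                     # illustrative rank
    lora_alpha=32,            # illustrative scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj"],
    task_type="CAUSAL_LM",
)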
Hi @SuperBruceJia, I'm also getting this error: KeyError: 'base_model.model.lm_head.base_layer.weight'. Can you please help? Here is my notebook: https://colab.research.google.com/drive/1hYdz4JYFuqzMM3pKFvsgH2ZMMc6KSy_y?usp=sharing and here is my model: https://huggingface.co/marksuccsmfewercoc/llava-1.5-7b-hf-ft-mix-vsft |
It seems that the repository only contains a LoRA adapter. You need to load the base model first, and then load the adapter on top of it. You may also need LoRA support in vLLM. |
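Concretely, something along these lines (a sketch; it assumes the base checkpoint is llava-hf/llava-1.5-7b-hf and that the linked repo holds only the PEFT adapter):

import torch
from peft import PeftModel
from transformers import LlavaForConditionalGeneration

base_id = "llava-hf/llava-1.5-7b-hf"                           # assumed base model
adapter_id = "marksuccsmfewercoc/llava-1.5-7b-hf-ft-mix-vsft"  # adapter repo from above

# Load the base model first, then apply the LoRA adapter on top of it
base = LlavaForConditionalGeneration.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)

# Optionally merge the adapter so the result can be saved as a plain checkpoint
model = model.merge_and_unload()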
I'm confused. How will that work? Can you give me an example? |
Please check this simple example: https://docs.vllm.ai/en/latest/models/lora.html (1) You first load the base model |
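Roughly along the lines of that docs page (a sketch; the model name and adapter path are placeholders, and the exact API may differ across vLLM versions):

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# (1) Load the base model with LoRA support enabled
llm = LLM(model="BASE_MODEL_NAME_OR_PATH", enable_lora=True)

# (2) Point a LoRARequest at the adapter checkpoint
lora_request = LoRARequest("my_adapter", 1, "path/to/lora_adapter")

# (3) Generate with the adapter applied on top of the base weights
sampling_params = SamplingParams(temperature=0.0, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], sampling_params, lora_request=lora_request)
print(outputs[0].outputs[0].text)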
Ok, so first I need to download the model from the Hugging Face Hub, use the LLaVA model in the LLM class, and set enable_lora to True |
Since there is a ..., and your adapter's rank is ..., I think it should work. |
Hey @SuperBruceJia, I'm getting this error: AssertionError: To be tested: vision language model with LoRA settings. I think it's not supported yet. |
Can we please have support for vision models? @ywang96 |
So I think we can't use a fine-tuned VLM with vLLM? |
When I try to run inference on my fine-tuned CodeLlama model using vLLM, I get this error:

File "/usr/local/lib/python3.9/dist-packages/vllm/engine/ray_utils.py", line 32, in execute_method
    return executor(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/vllm/worker/worker.py", line 70, in init_model
    self.model = get_model(self.model_config)
File "/usr/local/lib/python3.9/dist-packages/vllm/model_executor/model_loader.py", line 103, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
File "/usr/local/lib/python3.9/dist-packages/vllm/model_executor/models/llama.py", line 367, in load_weights
    param = state_dict[name.replace(weight_name, "qkv_proj")]
KeyError: 'base_model.model.model.layers.0.self_attn.qkv_proj.lora_A.weight'

Some shapes of my model:

Any suggestion or help is highly appreciated.
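One quick way to check whether the checkpoint you are pointing vLLM at still contains unmerged LoRA tensors (a sketch; the file name is a placeholder, and .bin checkpoints can be inspected with torch.load instead):

from safetensors import safe_open

# List the tensor names in the checkpoint; keys containing "lora_A"/"lora_B"
# mean the adapter was saved unmerged, which vLLM's plain weight loader cannot map
with safe_open("adapter_model.safetensors", framework="pt") as f:
    for key in f.keys():
        print(key)

If such keys show up, merging the adapter into the base model (as suggested above) and serving the merged checkpoint is the usual fix.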