
Questions about reproducing QVLM on LLaVA-v1.3-13B #9

Open
Wotoosh opened this issue Dec 26, 2024 · 6 comments


Wotoosh commented Dec 26, 2024

Dear author, I used the checkpoint from https://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview and tried to reproduce the W4A4 results by first running generate_sqa_response.sh and then evaluate_sqa_response.sh. However, I only get

Total: 4241, Correct: 1737, Accuracy: 40.96%, NAT Accuracy: 43.21%, SCO Accuracy: 29.92%, LAN Accuracy: 45.27%, TXT Accuracy: 43.01%, IMG Accuracy: 38.27%, NO Accuracy: 43.62%, G1 Accuracy: 41.01%, G7 Accuracy: 40.87%

instead of the 80.78% reported in the paper. Is there something wrong with the checkpoint I used?

Besides, the calculation in bitsandbytes/quantization_utils/quant_utils.py

    scale = n / (saturation_max - saturation_min)

causes an overflow, so I changed it to

    inter = (saturation_max - saturation_min).to(torch.float32)
    inter = torch.where(inter <= 1e-5, 1e-5, inter)
    scale = n / inter

Is this what causes the gap? If so, how should I properly deal with the overflow? Many thanks!
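For reference, a minimal sketch of the clamped scale computation described above, assuming saturation_max and saturation_min are fp16 tensors and n is the number of quantization levels (the helper name and eps value are illustrative, not from the repo):

    import torch

    def safe_scale(saturation_max: torch.Tensor,
                   saturation_min: torch.Tensor,
                   n: int,
                   eps: float = 1e-5) -> torch.Tensor:
        # Do the subtraction in fp32 so the fp16 range cannot overflow,
        # then clamp the denominator away from zero before dividing.
        rng = (saturation_max - saturation_min).to(torch.float32)
        rng = torch.clamp(rng, min=eps)
        return n / rng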

ChangyuanWang17 (Owner) commented

Sorry for the late reply. The model fine-tuned on the ScienceQA dataset is the one at this link: https://huggingface.co/liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3


Wotoosh commented Dec 30, 2024

Dear author, after switching to the checkpoint at https://huggingface.co/liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3, everything became much slower: the calibration phase went from under a minute to more than ten minutes, and the second phase would take more than 100 hours. I tried setting use_cache to True in config.json, but it didn't help. Have you ever encountered anything like this? Many thanks!


Wotoosh commented Dec 30, 2024


To be more specific, I have checked the placement of the model and its parameters: everything has been loaded onto CUDA. I also disabled the activation-quantization step in bitsandbytes, but that didn't fix the problem. When I move the safetensors from https://huggingface.co/liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3 into the folder of https://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview, it becomes slow again, and I have no idea why.
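In case it helps with debugging, a quick sanity check along the lines described above (a sketch assuming a Hugging Face-style model object; the function name is illustrative):

    from torch import nn

    def check_setup(model: nn.Module) -> None:
        # Confirm every parameter lives on CUDA rather than CPU.
        devices = {p.device for p in model.parameters()}
        print("parameter devices:", devices)
        # Confirm the KV cache is enabled for generation (HF-style config).
        if hasattr(model, "config"):
            print("use_cache:", getattr(model.config, "use_cache", None))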

ChangyuanWang17 (Owner) commented

After testing, the official llava-v1.3-13B model did not exhibit similar issues. You can check whether the model was downloaded completely and whether the model output is meaningful. Meanwhile, we have open-sourced the llava-v1.3-7B model that we fine-tuned on the ScienceQA dataset for testing. Hope this helps!

ChangyuanWang17 (Owner) commented

At the same time, when testing the model, please check that the regular-expression matching in the test file is correct:

pattern = re.compile(r'ANSWER: ([A-Z]).')
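To illustrate what that pattern does (a minimal, self-contained example; the sample responses are made up):

    import re

    pattern = re.compile(r'ANSWER: ([A-Z]).')

    # A response in the expected style matches; group(1) is the option letter.
    m = pattern.search("ANSWER: C. A ruler measures length.")
    print(m.group(1) if m else "no match")  # -> C

    # A response in any other style yields no match, so the sample is scored wrong.
    print(pattern.search("The answer is C."))  # -> None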


Wotoosh commented Jan 3, 2025

Many thanks! After re-downloading every file in https://huggingface.co/liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3, the earlier problem disappeared. However, I noticed that the original official checkpoint was fine-tuned on the "QCM-LEA" prompt format, since its name is "SQA-QCM-LEA-llava-13b-v1-3-pretrain_blip558k_plain-12e". Even after calibrating with the "QCM-LEPA" prompt format, it still prefers answers of the form matched by pattern = re.compile(r'The answer is ([A-Z]).'), and with this pattern I can reproduce the accuracy reported in the paper. I tried pattern = re.compile(r'ANSWER: ([A-Z]).'), but in that case the accuracy drops to 40% again, which is close to random guessing.
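A hedged sketch of the workaround this finding suggests: try the pattern the checkpoint actually produces first, then fall back to the pattern from the repo's test file (the helper name is illustrative):

    import re

    # Order matters: this checkpoint, fine-tuned on QCM-LEA/LEPA prompts,
    # emits "The answer is X." rather than "ANSWER: X.".
    PATTERNS = [
        re.compile(r'The answer is ([A-Z]).'),
        re.compile(r'ANSWER: ([A-Z]).'),
    ]

    def extract_choice(response: str):
        for pat in PATTERNS:
            m = pat.search(response)
            if m:
                return m.group(1)
        return None  # unparseable responses end up scored as wrong

    print(extract_choice("The answer is B."))  # -> B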
