How to evaluate VQ2 and BLIP2 on the Winoground metric? #11

HaozheZhao · 2023-11-21T06:43:07Z

Could you please explain precisely how you evaluate your methods and BLIP2 using the Winoground metric?

yonatanbitton · 2023-11-21T10:10:35Z

Sure,
We follow Winoground's evaluation exactly. You can see very similar code here: https://github.com/yonatanbitton/CLIPEvaluation/blob/main/src/eval_winoground.py
The only difference is that instead of CLIP score, we used BLIP2 image-text score.
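For readers who want the metric spelled out, here is a minimal sketch of the Winoground text, image, and group scores. It is not the code from the linked repository. It assumes each example has already been scored by some image-text scorer (CLIP, BLIP2, or anything else), and the dictionary keys such as `c0_i0` and the function names are hypothetical, chosen for illustration only.

```python
from typing import Dict, Iterable

# Hypothetical layout: one dict per Winoground example with four image-text
# scores. "c0_i0" is the score of caption 0 paired with image 0, and so on.
# The scores may come from any model; higher means a better match.
Scores = Dict[str, float]


def text_correct(s: Scores) -> bool:
    # For each image, the matching caption must outscore the other caption.
    return s["c0_i0"] > s["c1_i0"] and s["c1_i1"] > s["c0_i1"]


def image_correct(s: Scores) -> bool:
    # For each caption, the matching image must outscore the other image.
    return s["c0_i0"] > s["c0_i1"] and s["c1_i1"] > s["c1_i0"]


def group_correct(s: Scores) -> bool:
    # An example counts for the group score only if both checks pass.
    return text_correct(s) and image_correct(s)


def winoground_metrics(examples: Iterable[Scores]) -> Dict[str, float]:
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one scored example")
    n = len(examples)
    return {
        "text": sum(text_correct(s) for s in examples) / n,
        "image": sum(image_correct(s) for s in examples) / n,
        "group": sum(group_correct(s) for s in examples) / n,
    }


if __name__ == "__main__":
    # Made-up scores, only to show the call shape.
    demo = [
        {"c0_i0": 0.9, "c0_i1": 0.2, "c1_i0": 0.1, "c1_i1": 0.8},
        {"c0_i0": 0.9, "c0_i1": 0.95, "c1_i0": 0.1, "c1_i1": 0.99},
    ]
    print(winoground_metrics(demo))
```

Swapping the scorer only changes how the four numbers per example are produced; the comparisons stay the same.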
Please let me know if it's clearer now or if you have additional questions.
