
Update Vinoground to make evaluation consistent with paper #354

Merged
3 commits merged on Oct 26, 2024

Conversation

HanSolo9682
Contributor

As mentioned in Issue #350, the video score and group score results for many models are inconsistent with the original paper. This is because the lmms-eval code we initially provided did not use shuffled questions, as the paper does. We have updated the code to reflect that, and we can now reproduce the paper's results on llava-ov-qwen2-7b.
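The fix concerns how the multiple-choice questions are constructed: if the true caption is always presented in the same position, a model with a positional bias can score very differently than under the paper's shuffled setup. The sketch below illustrates the general idea of deterministic option shuffling; it is a hypothetical helper (the function name, prompt wording, and seeding scheme are assumptions for illustration), not the actual lmms-eval/Vinoground implementation.

```python
import random


def build_question(true_caption: str, false_caption: str, seed: int):
    """Build a two-choice caption question with a shuffled option order.

    A per-question seed keeps the shuffle deterministic, so every model
    is evaluated on exactly the same option ordering.
    """
    rng = random.Random(seed)
    options = [true_caption, false_caption]
    rng.shuffle(options)  # option order now varies across questions

    labels = ["A", "B"]
    prompt = "Which caption matches the video?\n" + "\n".join(
        f"{label}. {option}" for label, option in zip(labels, options)
    )
    # The correct label is wherever the true caption landed after shuffling.
    answer = labels[options.index(true_caption)]
    return prompt, answer
```

With a fixed (or always-first) option order, a model that simply prefers "A" would inflate the video and group scores; shuffling removes that shortcut, which is why the unshuffled evaluation disagreed with the paper.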

[Screenshot, 2024-10-25: reproduced evaluation results on llava-ov-qwen2-7b]

@kcz358
Collaborator

kcz358 commented Oct 26, 2024

Thank you for fixing the bugs reported in the issue! Merging this PR.

@kcz358 kcz358 merged commit f255e5b into EvolvingLMMs-Lab:main Oct 26, 2024
1 check passed
ZhaoCinyu pushed a commit to ZhaoCinyu/lmms-eval that referenced this pull request Dec 9, 2024
…MMs-Lab#354)

* add vinoground

* make evaluation consistent to paper

---------

Co-authored-by: jzhang2427 <jzhang2427@wisc.edu>