
# CMMMU

🌐 Homepage | 🤗 Paper | 📖 arXiv | 🤗 Dataset | 🏆 EvalAI | GitHub

This repo contains the evaluation code for the paper "CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark".

## News

🔥[2024-04-26]: Thanks to the support of the VLMEvalKit team, everyone can now use VLMEvalKit to easily run evaluations!

🔥[2024-03-14]: Thanks to the support of the lmms-eval team, everyone can now use lmms-eval to easily run evaluations!

🌟[2024-03-06]: Our evaluation server for the test set is now available on EvalAI. We welcome all submissions and look forward to your participation! 😆

## Introduction

Like its companion benchmark MMMU, CMMMU includes 12k manually collected multimodal questions from college exams, quizzes, and textbooks, covering six core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, and Tech & Engineering. These questions span 30 subjects and comprise 39 highly heterogeneous image types, such as charts, diagrams, maps, tables, music sheets, and chemical structures.


## Evaluation

Please refer to our `eval` folder for more details.
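
If you want a quick look at the data before wiring up a full evaluation, the sketch below loads the benchmark from Hugging Face and tallies question types on the validation split. This is a minimal illustration rather than our evaluation code; the repo id `m-a-p/CMMMU`, the split name, and the `type` field are assumptions to check against the dataset card.

```python
# Minimal sketch for browsing CMMMU before running a full evaluation.
# Assumptions to verify against the dataset card: the Hugging Face repo id
# "m-a-p/CMMMU", the split name "val", and the "type" field; a config name
# may also be required if the dataset is organized by subject.
from collections import Counter

from datasets import load_dataset

dataset = load_dataset("m-a-p/CMMMU", split="val")

type_counts = Counter(example["type"] for example in dataset)
print(f"{len(dataset)} validation questions")
for question_type, count in type_counts.most_common():
    print(f"  {question_type}: {count}")
```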

## 🏆 Mini-Leaderboard

| Model | Val (900) | Test (11K) |
|-------|----------:|-----------:|
| GPT-4V(ision) (Playground) | 42.5 | 43.7 |
| Qwen-VL-PLUS* | 39.5 | 36.8 |
| Yi-VL-34B | 36.2 | 36.5 |
| Yi-VL-6B | 35.8 | 35.0 |
| InternVL-Chat-V1.1* | 34.7 | 34.0 |
| Qwen-VL-7B-Chat | 30.7 | 31.3 |
| SPHINX-MoE* | 29.3 | 29.5 |
| InternVL-Chat-ViT-6B-Vicuna-7B | 26.4 | 26.7 |
| InternVL-Chat-ViT-6B-Vicuna-13B | 27.4 | 26.1 |
| CogAgent-Chat | 24.6 | 23.6 |
| Emu2-Chat | 23.8 | 24.5 |
| Chinese-LLaVA | 25.5 | 23.4 |
| VisCPM | 25.2 | 22.7 |
| mPLUG-OWL2 | 20.8 | 22.2 |
| Frequent Choice | 24.1 | 26.0 |
| Random Choice | 21.6 | 21.6 |

*: results provided by the authors.
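
The two baselines at the bottom of the table are straightforward to reproduce on the multiple-choice subset: Frequent Choice always predicts the most common answer option in the split, while Random Choice samples an option uniformly at random. The sketch below is an illustrative implementation under assumed field names (`options`, `answer`), not the official scoring script, which also has to handle true-or-false and fill-in-the-blank questions.

```python
# Illustrative sketch of the two leaderboard baselines for multiple-choice
# questions. Field names ("options", "answer") are assumptions; the official
# scoring also covers true-or-false and fill-in-the-blank questions.
import random
from collections import Counter


def frequent_choice_accuracy(examples):
    """Always predict the option letter that is most often correct."""
    answers = [ex["answer"] for ex in examples]
    most_common, _ = Counter(answers).most_common(1)[0]
    return sum(a == most_common for a in answers) / len(answers)


def random_choice_accuracy(examples, seed=0):
    """Predict a uniformly random option letter per question."""
    rng = random.Random(seed)
    correct = 0
    for ex in examples:
        letters = [chr(ord("A") + i) for i in range(len(ex["options"]))]
        correct += rng.choice(letters) == ex["answer"]
    return correct / len(examples)


if __name__ == "__main__":
    toy = [
        {"options": ["a", "b", "c", "d"], "answer": "A"},
        {"options": ["a", "b", "c", "d"], "answer": "A"},
        {"options": ["a", "b", "c", "d"], "answer": "C"},
    ]
    print(frequent_choice_accuracy(toy))  # 2/3: "A" is the modal answer
    print(random_choice_accuracy(toy))
```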

## Disclaimers

Our annotation guidelines emphasized strict compliance with the copyright and licensing rules of the original data sources; annotators were specifically instructed to avoid material from websites that forbid copying and redistribution. If you encounter a data sample that potentially breaches the copyright or licensing terms of any site, please contact us. Upon verification, such samples will be removed promptly.

## Contact

## Citation

BibTeX:

```bibtex
@article{zhang2024cmmmu,
  title={CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark},
  author={Zhang, Ge and Du, Xinrun and Chen, Bei and Liang, Yiming and Luo, Tongxu and Zheng, Tianyu and Zhu, Kang and Cheng, Yuyang and Xu, Chunpu and Guo, Shuyue and Zhang, Haoran and Qu, Xingwei and Wang, Junjie and Yuan, Ruibin and Li, Yizhi and Wang, Zekun and Liu, Yudong and Tsai, Yu-Hsuan and Zhang, Fengji and Lin, Chenghua and Huang, Wenhao and Fu, Jie},
  journal={arXiv preprint arXiv:2401.20847},
  year={2024}
}
```