[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
navigation perception summarization reasoning visual-reasoning egocentric-videos gpt-4 multiple-choice-questions benchmark-dataset video-language-understanding multimodal-large-language-models evals gemini-pro spatial-intelligence neurips-2024 1-hour-video-language-understanding long-form-video-language-understanding long-context-understanding
Updated Dec 2, 2024 - Jupyter Notebook