    Repositories list

    • [CVPR'25] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
      Python
      Apache License 2.0
      02040 Updated Mar 26, 2025
    • ReT

      Public
      Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval (CVPR 2025)
      Python
      01010 Updated Mar 26, 2025
    • LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning
      Python
      Apache License 2.0
      811611 Updated Mar 25, 2025
    • mammoth

      Public
      An Extendible (General) Continual Learning Framework based on PyTorch - official codebase of Dark Experience for General Continual Learning
      Python
      MIT License
      11865050 Updated Mar 25, 2025
    • DitHub

      Public
      HTML
      Apache License 2.0
      0200 Updated Mar 25, 2025
    • HySAC

      Public
      Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025
      Python
      01110 Updated Mar 25, 2025
    • Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
      0300 Updated Mar 23, 2025
    • HWD

      Public
      Python
      Other
      12300 Updated Mar 7, 2025
    • VATr

      Public
      Python
      MIT License
      58030 Updated Mar 7, 2025
    • pacscore

      Public
      Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. (CVPR 2023)
      Python
      56030 Updated Mar 5, 2025
    • COGT

      Public
      Causal Graphical Models for Vision-Language Compositional Understanding (ICLR 2025)
      Python
      0600 Updated Mar 5, 2025
    • General Federated Continual Learning Framework
      Python
      0310 Updated Mar 3, 2025
    • cvcs2025

      Public
      0000 Updated Feb 28, 2025
    • LAM

      Public
      The Ludovico Antonio Muratori (LAM) dataset is the largest line-level HTR dataset to date and contains 25,823 lines from Italian ancient manuscripts edited by a single author over 60 years. The dataset comes in two configurations: a basic splitting and a date-based splitting which takes into account the age of the author. The first setting is in…
      0500 Updated Feb 26, 2025
    • CoDE

      Public
      [ECCV'24] Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
      Python
      MIT License
      03310 Updated Feb 3, 2025
    • sva2021

      Public
      0000 Updated Jan 27, 2025
    • MaPeT

      Public
      Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training
      Python
      11620 Updated Jan 25, 2025
    • User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
      JavaScript
      MIT License
      11k000 Updated Jan 15, 2025
    • pipelines

      Public
      Pipelines: Versatile, UI-Agnostic OpenAI-Compatible Plugin Framework
      Python
      MIT License
      491000 Updated Jan 10, 2025
    • Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022
      Python
      Other
      64550100 Updated Dec 12, 2024
    • PASTA

      Public
      HTML
      Apache License 2.0
      0400 Updated Dec 5, 2024
    • This repository contains a curated list of research papers and resources focusing on saliency and scanpath prediction, human attention, and human visual search.
      24800 Updated Dec 3, 2024
    • mil4wsi

      Public
      DAS-MIL: Distilling Across Scales for MIL Classification of Histological WSIs
      Python
      MIT License
      65610 Updated Nov 29, 2024
    • Python
      MIT License
      0410 Updated Nov 2, 2024
    • JavaScript
      0000 Updated Oct 25, 2024
    • pin

      Public
      Jupyter Notebook
      Other
      01100 Updated Oct 18, 2024
    • HEaD

      Public
      0000 Updated Sep 30, 2024
    • DiCO

      Public
      Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization (BMVC 2024 Oral ✨)
      Python
      01700 Updated Sep 11, 2024
    • coldfront

      Public
      HPC Resource Allocation System
      Python
      GNU General Public License v3.0
      89000 Updated Sep 7, 2024
    • Alfie

      Public
      Democratising RGBA Image Generation With No $$$ (AI4VA@ECCV24)
      Python
      22500 Updated Sep 2, 2024
    83 repositories found. List is sorted by Last pushed in descending order.