  • Southern University of Science and Technology
  • Shenzhen

zjr2000/README.md

Hi there 👋

This is Jinrui.

🏫 I'm a graduate student at SUSTech with a bachelor's degree in Computer Science.

🔭 I’m currently working toward an M.S. degree at SUSTech.

🌱 I’m currently focused on vision-language research.

Jinrui's GitHub stats

Pinned

  1. ttengwang/Caption-Anything Public

    Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…

    Python, 1.7k stars, 104 forks

  2. REVERIE Public

    [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

    Python, 15 stars

  3. GVL Public

    Official implementation of the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"

    Python, 27 stars, 6 forks

  4. LLMVA-GEBC Public

    Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop)

    Python, 29 stars, 2 forks

  5. Context-GEBC Public

    Second-place solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2022 workshop)

    Python, 4 stars, 1 fork

  6. Awesome-Multimodal-Chatbot Public

    Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction, such as text, speech, images, and videos, to provide a sea…

    73 stars, 7 forks