multimodal-llm

Star

Here are 14 public repositories matching this topic...

eric-ai-lab / MiniGPT-5

Star

Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"

transformers diffusion-models multimodal-generation multimodal-llm

Updated Dec 12, 2024
Python

alipay / Ant-Multi-Modal-Framework

Star

Research Code for Multimodal-Cognition Team in Ant Group

video-editing multimodal-learning video-text-retrieval image-text-retrieval multimodal-llm

Updated Jul 11, 2024
Python

Zhoues / MineDreamer

Star

[NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control "

minecraft diffusion-model embodied-agent multimodal-llm

Updated Jun 30, 2024
Python

UCSC-VLAA / vllm-safety-benchmark

Star

[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"

benchmark safety datasets robustness adversarial-attacks llm vision-language-model multimodal-llm

Updated Nov 28, 2023
Python

AIDC-AI / Wings

Star

The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]

deep-learning mllm multimodal-large-language-models multimodal-llm text-only-forgetting

Updated Dec 28, 2024
Python

HenryPengZou / ImplicitAVE

Star

[ACL 2024] Dataset and Code of "ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction"

attribute-value-extraction vision-language-model multimodal-llm implicit-attribute-value-extraction

Updated Jun 10, 2024
Jupyter Notebook

shanface33 / GPT4MF_UB

Star

Official repository of the paper: Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

image-forensics deepfake-detection deepfake-images chatgpt-4 multimodal-llm ai-generated-image-detection

Updated Mar 22, 2024

iamaziz / chat_with_images

Star

Streamlit app to chat with images using Multi-modal LLMs.

streamlit llms llava multimodal-llm

Updated Mar 17, 2024
Python

zhudotexe / kani-vision

Sponsor

Star

Kani extension for supporting vision-language models (VLMs). Comes with model-agnostic support for GPT-Vision and LLaVA.

extension kani large-language-models vision-language-model llava multimodal-llm gpt-vision

Updated Nov 22, 2023
Python

autodistill / autodistill-llava

Star

LLaVA base model for use with Autodistill.

computer-vision llava autodistill multimodal-llm

Updated Jan 24, 2024
Python

aastroza / cachai

Star

The future of AI is speaking Chilean, cachai?

openai chile build-in-public llm multimodal-llm

Updated Jul 20, 2024
Jupyter Notebook

Jiaxuan-Li / NEMO

Star

NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects?

robustness vision-and-languge multimodal-llm

Updated Nov 28, 2024
JavaScript

abdur75648 / MedicalGPT

Star

Medical Report Generation And VQA (Adapting XrayGPT to Any Modality)

medical-imaging vqa llama vqa-dataset medical-dataset vicuna llm medical-report-generation llms chatgpt minigpt4 multimodal-llm medicalgpt chatgpt4o xraygpt

Updated Jun 24, 2024
Python

ChocoWu / SeTok-web

Star

This is the project webpage for 'SeTok'.

multimodal-llm vision-tokenization

Updated Oct 8, 2024
CSS

Improve this page

Add a description, image, and links to the multimodal-llm topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the multimodal-llm topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multimodal-llm

Here are 14 public repositories matching this topic...

eric-ai-lab / MiniGPT-5

alipay / Ant-Multi-Modal-Framework

Zhoues / MineDreamer

UCSC-VLAA / vllm-safety-benchmark

AIDC-AI / Wings

HenryPengZou / ImplicitAVE

shanface33 / GPT4MF_UB

iamaziz / chat_with_images

zhudotexe / kani-vision

autodistill / autodistill-llava

aastroza / cachai

Jiaxuan-Li / NEMO

abdur75648 / MedicalGPT

ChocoWu / SeTok-web

Improve this page

Add this topic to your repo