This is the official repository for the paper "Fine-Grained Semantically Aligned Vision-Language Pre-Training" (NeurIPS 2022).
We are pleased to announce that our multi-modal large language model is now open-sourced: "Empowering Vision-Language Models to Follow Interleaved Vision-Language Instructions".