Introduction of Nomic-Embed-Vision for Multimodal Tasks #5687
eduardolundgren started this conversation in Ideas
Feature request
Hi there,
I'm a frequent user of Nomic-Embed-Text, and I've heard about the just-released Nomic-Embed-Vision model. I'm really excited about it and wanted to check in on its progress.
From what I've read, Nomic-Embed-Vision will offer:
- A high-quality, unified embedding space for image, text, and multimodal tasks.
- Performance that surpasses both OpenAI CLIP and text-embedding-3-small.
- Open weights and code, which is fantastic for indie hacking, research, and experimentation.
https://x.com/nomic_ai/status/1798368463292973361
https://huggingface.co/nomic-ai/nomic-embed-vision-v1.5
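To make the first point above concrete: a shared embedding space means a text query vector can be compared directly against image vectors. Below is a minimal TypeScript sketch of cross-modal retrieval under that assumption. `MultimodalEmbeddings` is a hypothetical interface invented for illustration; it is not an existing LangChain or Nomic export.

```typescript
// Hypothetical interface capturing what a unified image/text embedding space
// would let us do: compare vectors from either modality directly.
interface MultimodalEmbeddings {
  embedQuery(text: string): Promise<number[]>;
  embedImage(imagePath: string): Promise<number[]>;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank images by how well they match a text query -- cross-modal retrieval
// made possible by embedding both modalities into the same space.
async function rankImages(
  embeddings: MultimodalEmbeddings,
  query: string,
  imagePaths: string[]
): Promise<{ path: string; score: number }[]> {
  const queryVec = await embeddings.embedQuery(query);
  const scored = await Promise.all(
    imagePaths.map(async (path) => ({
      path,
      score: cosineSimilarity(queryVec, await embeddings.embedImage(path)),
    }))
  );
  return scored.sort((a, b) => b.score - a.score);
}
```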
When might Nomic-Embed-Vision be available in LangChain JavaScript? Any updates on the timeline or development status would be much appreciated.
The Python library added support for it here.
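For parity on the JS side, here is a rough sketch of how such an integration could slot into LangChain's existing `Embeddings` abstraction. Everything Nomic-specific (the class name, constructor options, default model string, and the extra `embedImages` method) is an assumption for illustration only; the base class import from `@langchain/core/embeddings` is real, and the low-level HTTP call is left as an injected callback rather than guessing at Nomic's endpoints.

```typescript
import { Embeddings, type EmbeddingsParams } from "@langchain/core/embeddings";

// Caller-supplied function that performs the actual API request.
type EmbedFn = (inputs: string[], modality: "text" | "image") => Promise<number[][]>;

interface NomicMultimodalEmbeddingsParams extends EmbeddingsParams {
  embedFn: EmbedFn;
  model?: string; // e.g. "nomic-embed-vision-v1.5" (assumed default)
}

class NomicMultimodalEmbeddings extends Embeddings {
  private embedFn: EmbedFn;
  readonly model: string;

  constructor(params: NomicMultimodalEmbeddingsParams) {
    super(params);
    this.embedFn = params.embedFn;
    this.model = params.model ?? "nomic-embed-vision-v1.5";
  }

  // Standard LangChain text entry points; vectors land in the shared space.
  async embedDocuments(texts: string[]): Promise<number[][]> {
    return this.embedFn(texts, "text");
  }

  async embedQuery(text: string): Promise<number[]> {
    const [vector] = await this.embedFn([text], "text");
    return vector;
  }

  // Hypothetical extra method for image inputs (paths or URLs).
  async embedImages(images: string[]): Promise<number[][]> {
    return this.embedFn(images, "image");
  }
}
```

Keeping image embeddings on the same class would let downstream vector stores and retrievers stay agnostic about which modality produced a given vector.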
Thanks for your hard work on this library! I'm looking forward to seeing what Nomic-Embed-Vision can do.
Motivation
Nomic-Embed-Vision is important because it provides a versatile and high-performing embedding model that can handle both image and text data effectively. This unified solution will enhance performance and flexibility in multimodal tasks, which are becoming increasingly common in various applications.
Proposal (If applicable)
No response