How to use Phi-3 Vision through vLLM in LangChain for extracting image text data #24071
Unanswered
tusharraskar asked this question in Q&A
-
To pass an image to a model through LangChain and extract text data, you can use the VertexAIImageCaptioning class from langchain_google_vertexai:

from langchain_google_vertexai import VertexAIImageCaptioning
import requests
import base64
import io
from PIL import Image
# URL of the image you want to process
image_url = "URL_OF_YOUR_IMAGE"
image_content = requests.get(image_url).content
# Convert image content to base64
img_base64 = base64.b64encode(image_content).decode('utf-8')
img_base64_str = f"data:image/jpeg;base64,{img_base64}"
# Initialize the Image Captioning Object
model = VertexAIImageCaptioning()
# Invoke the model with the base64 image string
response = model.invoke(img_base64_str)
print(f"Extracted Text: {response}")
# Convert base64 string back to Image for display (optional)
img = Image.open(io.BytesIO(base64.b64decode(img_base64)))
img.show()
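
Note that the snippet above goes through Google Vertex AI rather than Phi-3 Vision itself. To query Phi-3 Vision served by vLLM from LangChain, one option is to run vLLM's OpenAI-compatible server and point ChatOpenAI at it. The sketch below is an assumption-laden example, not a confirmed recipe: it assumes vLLM is serving microsoft/Phi-3-vision-128k-instruct locally on port 8000, and the endpoint URL and prompt are placeholders to adapt.

# Minimal sketch: querying Phi-3 Vision through vLLM's OpenAI-compatible API.
# Assumes the server was started with something like:
#   vllm serve microsoft/Phi-3-vision-128k-instruct --trust-remote-code
import base64

import requests
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# Fetch and base64-encode the image, as in the example above
image_url = "URL_OF_YOUR_IMAGE"
img_base64 = base64.b64encode(requests.get(image_url).content).decode("utf-8")

# Point ChatOpenAI at the local vLLM endpoint (no real API key is needed)
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",
    model="microsoft/Phi-3-vision-128k-instruct",
)

# Send the image and an OCR-style instruction in one multimodal message
message = HumanMessage(
    content=[
        {"type": "text", "text": "Extract all text visible in this image."},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
        },
    ]
)

response = llm.invoke([message])
print(f"Extracted Text: {response.content}")

Because vLLM exposes the standard OpenAI chat-completions API, the same multimodal message format works unchanged if you later swap the base_url for a hosted endpoint.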
-
Description
I want to pass an image to Phi-3 Vision and extract its text data. How can I do this?
System Info
Python 3.10