How to use Phi-3 Vision through vLLM in LangChain for extracting image text data #24071
Unanswered
tusharraskar asked this question in Q&A
-
To pass an image to a model through LangChain and extract text data, you can use the VertexAIImageCaptioning class from langchain_google_vertexai:

from langchain_google_vertexai import VertexAIImageCaptioning
import requests
import base64
import io
from PIL import Image
# URL of the image you want to process
image_url = "URL_OF_YOUR_IMAGE"
image_content = requests.get(image_url).content
# Convert image content to base64
img_base64 = base64.b64encode(image_content).decode('utf-8')
img_base64_str = f"data:image/jpeg;base64,{img_base64}"
# Initialize the Image Captioning Object
model = VertexAIImageCaptioning()
# Invoke the model with the base64 image string
response = model.invoke(img_base64_str)
print(f"Extracted Text: {response}")
# Convert base64 string back to Image for display (optional)
img = Image.open(io.BytesIO(base64.b64decode(img_base64)))
img.show()
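
Note that the snippet above goes through Google Vertex AI rather than Phi-3 Vision itself. To query Phi-3 Vision served by vLLM from LangChain, one option is to run vLLM's OpenAI-compatible server and point ChatOpenAI at it. The sketch below is an assumption-laden example, not a confirmed recipe: it assumes vLLM is serving microsoft/Phi-3-vision-128k-instruct locally on port 8000, and the endpoint URL and prompt are placeholders to adapt.

# Minimal sketch: querying Phi-3 Vision through vLLM's OpenAI-compatible API.
# Assumes the server was started with something like:
#   vllm serve microsoft/Phi-3-vision-128k-instruct --trust-remote-code
import base64

import requests
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# Fetch and base64-encode the image, as in the example above
image_url = "URL_OF_YOUR_IMAGE"
img_base64 = base64.b64encode(requests.get(image_url).content).decode("utf-8")

# Point ChatOpenAI at the local vLLM endpoint (no real API key is needed)
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",
    model="microsoft/Phi-3-vision-128k-instruct",
)

# Send the image and an OCR-style instruction in one multimodal message
message = HumanMessage(
    content=[
        {"type": "text", "text": "Extract all text visible in this image."},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
        },
    ]
)

response = llm.invoke([message])
print(f"Extracted Text: {response.content}")

Because vLLM exposes the standard OpenAI chat-completions API, the same multimodal message format works unchanged if you later swap the base_url for a hosted endpoint.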
-
Description
I want to pass an image to Phi-3 Vision and extract its text data. How can I do this?
System Info
Python 3.10