Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
chat.py	chat.py
omni.py	omni.py

MiniCPM-o-2_6

In this directory, you will find examples on how you could apply IPEX-LLM INT4 optimizations on MiniCPM-o-2_6 model on Intel GPUs. For illustration purposes, we utilize openbmb/MiniCPM-o-2_6 as reference MiniCPM-o-2_6 model.

In the following examples, we will guide you to apply IPEX-LLM optimizations on MiniCPM-o-2_6 model for text/audio/image/video inputs.

0. Requirements & Installation

To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to here for more information.

0.1 Install IPEX-LLM

For Intel Core™ Ultra Processors (Series 2) with processor number 2xxV (code name Lunar Lake) on Windows:

conda create -n llm python=3.11 libuv
conda activate llm

:: or --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/cn/
pip install --pre --upgrade ipex-llm[xpu_lnl] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us/
pip install torchaudio==2.3.1.post0 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us/

For Intel Arc B-Series GPU (code name Battlemage) on Linux:

conda create -n llm python=3.11
conda activate llm

# or --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
pip install --pre --upgrade ipex-llm[xpu-arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install torchaudio==2.3.1+cxx11.abi --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

Note

We will update for installation on more Intel GPU platforms.

0.2 Install Required Pacakges for MiniCPM-o-2_6

conda activate llm

# refer to: https://huggingface.co/openbmb/MiniCPM-o-2_6#usage
pip install transformers==4.44.2 trl
pip install librosa==0.9.0
pip install soundfile==0.12.1
pip install moviepy

0.3 Runtime Configuration

For Intel Core™ Ultra Processors (Series 2) with processor number 2xxV (code name Lunar Lake) on Windows:
```
set SYCL_CACHE_PERSISTENT=1
```
For Intel Arc B-Series GPU (code name Battlemage) on Linux:
```
unset OCL_ICD_VENDOR
export SYCL_CACHE_PERSISTENT=1
```

Note

We will update for runtime configuration on more Intel GPU platforms.

1. Example: Chat in Omni Mode

In omni.py, we show a use case for a MiniCPM-V-2_6 model to chat in omni mode with IPEX-LLM INT4 optimizations on Intel GPUs. In this example, the model will take a video as input, and conduct inference based on the images and audio of this video.

For example, the video input shows a clip of an athlete swimming, with background audio asking "What the athlete is doing?". Then the model in omni mode should inference based on the images of the video and the question in audio.

1.1 Running example

python omni.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --video-path VIDEO_PATH

Arguments info:

--repo-id-or-model-path REPO_ID_OR_MODEL_PATH: argument defining the huggingface repo id for MiniCPM-o-2_6 model (e.g. openbmb/MiniCPM-o-2_6) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be 'openbmb/MiniCPM-o-2_6'.
--video-path VIDEO_PATH: argument defining the video input.
--n-predict N_PREDICT: argument defining the max number of tokens to predict. It is default to be 32.

Note

In Omni mode, please make sure that the video input contains sound.

Tip

You could just ignore the warning regarding Some weights of the model checkpoint at xxx were not used when initializing MiniCPMO.

2. Example: Chat with text/audio/image input

In chat.py, we show a use case for a MiniCPM-V-2_6 model to chat based on text/audio/image, or a combination of two of them, with IPEX-LLM INT4 optimizations on Intel GPUs.

2.1 Running example

Chat with text input

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT

Chat with audio input

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --audio-path AUDIO_PATH

Chat with image input

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --image-path IMAGE_PATH

Chat with text + audio inputs

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --audio-path AUDIO_PATH

Chat with text + image inputs

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --image-path IMAGE_PATH

Chat with audio + image inputs

python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --audio-path AUDIO_PATH --image-path IMAGE_PATH

Arguments info:

--repo-id-or-model-path REPO_ID_OR_MODEL_PATH: argument defining the huggingface repo id for MiniCPM-o-2_6 model (e.g. openbmb/MiniCPM-o-2_6) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be 'openbmb/MiniCPM-o-2_6'.
--prompt PROMPT: argument defining the text input.
--audio-path AUDIO_PATH: argument defining the audio input.
--image-path IMAGE_PATH: argument defining the image input.
--n-predict N_PREDICT: argument defining the max number of tokens to predict. It is default to be 32.

Tip

You could just ignore the warning regarding Some weights of the model checkpoint at xxx were not used when initializing MiniCPMO.

2.2 Sample Outputs

openbmb/MiniCPM-o-2_6

The sample input image is (which is fetched from COCO dataset):

http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg

And the sample audio is a person saying "What is in this image".

Chat with text + image inputs

Inference time: xxxx s
-------------------- Input Image Path --------------------
5602445367_3504763978_z.jpg
-------------------- Input Audio Path --------------------
None
-------------------- Input Prompt --------------------
What is in this image?
-------------------- Chat Output --------------------
The image features a young child holding and displaying her white teddy bear. She is wearing a pink dress, which complements the color of the stuffed toy she

Chat with audio + image inputs:

Inference time: xxxx s
-------------------- Input Image Path --------------------
5602445367_3504763978_z.jpg
-------------------- Input Audio Path --------------------
test_audio.wav
-------------------- Input Prompt --------------------
None
-------------------- Chat Output --------------------
In this image, there is a young girl holding and displaying her stuffed teddy bear. She appears to be the main subject of the photo, with her toy

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MiniCPM-o-2_6

MiniCPM-o-2_6

README.md

MiniCPM-o-2_6

0. Requirements & Installation

0.1 Install IPEX-LLM

0.2 Install Required Pacakges for MiniCPM-o-2_6

0.3 Runtime Configuration

1. Example: Chat in Omni Mode

1.1 Running example

2. Example: Chat with text/audio/image input

2.1 Running example

2.2 Sample Outputs

openbmb/MiniCPM-o-2_6

Files

MiniCPM-o-2_6

Directory actions

More options

Directory actions

More options

Latest commit

History

MiniCPM-o-2_6

Folders and files

parent directory

README.md

MiniCPM-o-2_6

0. Requirements & Installation

0.1 Install IPEX-LLM

0.2 Install Required Pacakges for MiniCPM-o-2_6

0.3 Runtime Configuration

1. Example: Chat in Omni Mode

1.1 Running example

2. Example: Chat with text/audio/image input

2.1 Running example

2.2 Sample Outputs

openbmb/MiniCPM-o-2_6