MyColPali

This application utilizes the ColPali vision language model and OpenAI capabilities to implement various document processing features.

Introduction to ColPali

ColPali is a groundbreaking document retrieval model that utilizes Vision Language Models (VLM).
Paper (arXiv): ColPali: Efficient Document Retrieval with Vision Language Models

ColPali Features

Efficient document indexing using Vision Language Models.
Capable of handling various document types, including text, tables, and images.
No need for processes such as OCR, Layout Parsing, Chunking, Captioning, or text embedding models.
Relatively fast response processing compared to other RAG systems.
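
For readers wondering how retrieval can work without OCR or chunking: the ColPali paper describes a ColBERT-style late interaction, in which each query token embedding is matched against its most similar page patch embedding and the maxima are summed. The following is a minimal pure-Python sketch of that scoring rule, not this application's code. The function names and the toy 2-dimensional vectors are made up for illustration; real embeddings come from the model and are much larger.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_vecs, page_vecs):
    """Sum, over query tokens, of the best-matching page patch similarity."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Toy embeddings (hypothetical values): two query tokens, two patches per page.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8]]
page_b = [[0.5, 0.5], [0.4, 0.4]]

score_a = late_interaction_score(query, page_a)
score_b = late_interaction_score(query, page_b)

# page_a has a strong match for each query token, so it ranks above page_b.
print(score_a > score_b)
```

Because each page is scored independently, page embeddings can be computed once at indexing time and reused for every query.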

Updates

Added the ColQwen2 model.
Update the dependencies by running the following command:

pip install -r requirements.txt

Prerequisites

Before you begin, ensure you have met the following requirements:

Python:

Make sure you have Python 3.10 or later installed. You can download it from the official Python website.

  python --version
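
The same requirement can also be checked from inside Python, for example at the top of a launcher script. This is a small sketch; `MIN_VERSION` and `python_is_supported` are hypothetical names and are not part of this repository.

```python
import sys

# Minimum version stated in the prerequisites above.
MIN_VERSION = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print("Python 3.10 or later is required, found", sys.version.split()[0])
```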

pip:

Ensure you have pip installed, which is the package installer for Python.

Git:

Ensure you have Git installed for version control. You can download it from the official Git website.

Virtual Environment:

It is recommended to use a virtual environment to manage your project dependencies.

You can create a virtual environment using venv:

  python -m venv venv

  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
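
To confirm that the environment is actually active before installing dependencies, one option is to compare `sys.prefix` with `sys.base_prefix`, which differ inside an environment created by `venv`. A minimal sketch, with `in_virtualenv` as a hypothetical helper name:

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```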

IDE/Code Editor:

Use an IDE or code editor of your choice. Popular options include PyCharm, VSCode, and Eclipse.

PlantUML:

PlantUML is used for generating UML diagrams.

Download PlantUML from the official PlantUML website, or install it as a PyCharm plugin or Xcode extension.

Quick Install

Clone repository

git clone https://github.com/hyun-yang/MyColPali

With pip:

pip install -r requirements.txt

Or, inside a virtual environment (venv), use this command:

python -m pip install -r requirements.txt

Run main.py

python main.py

Configure API Key

Open the 'Setting' menu and set the API key.

Re-run main.py

python main.py

ColPali/ColQwen2 Model Download

Make sure to download the ColPali/ColQwen2 model prior to using the application.

The total download size is over 5GB for ColPali and over 8GB for ColQwen2. Depending on your network speed, this may take some time.

Choose one of the two methods below to download:

Use the Hugging Face download tool to fetch vidore/colpali-v1.2 (ColPali) and/or vidore/colqwen2-v0.1 (ColQwen2).
Open the Jupyter notebook file download_model/download_colpali_model.ipynb and run it.
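
For a scripted download, the `huggingface_hub` package provides `snapshot_download`. The sketch below shows one way to call it for the two repositories named above. It is not necessarily what the notebook does, and `MODEL_REPOS`, `download_model`, and the optional `local_dir` argument are illustrative names. Where the app expects the files to live is not stated here, so by default the sketch uses the standard Hugging Face cache.

```python
# Repository ids as listed in this README.
MODEL_REPOS = {
    "colpali": "vidore/colpali-v1.2",
    "colqwen2": "vidore/colqwen2-v0.1",
}


def download_model(name, local_dir=None):
    """Download every file of the chosen model repo and return the local path.

    local_dir=None keeps the files in the default Hugging Face cache.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=MODEL_REPOS[name], local_dir=local_dir)
```

Calling `download_model("colpali")` or `download_model("colqwen2")` starts the transfer. It requires `pip install huggingface_hub` and a network connection.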

PyTorch Installation

To utilize the GPU, you need to install a version of PyTorch that is compatible with your operating system and the CUDA version supported by your GPU.

If PyTorch is not installed correctly, or if you do not have a GPU, the app runs in CPU mode, which is slower.

Please refer to the Utility.get_torch_device method in the util folder for more information.
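
As a rough illustration of the fallback described above, a device check in PyTorch usually looks like the sketch below. This is not the repository's `Utility.get_torch_device` implementation, which is not reproduced here. `pick_device` is a hypothetical name, and the Apple `mps` branch is an assumption that goes beyond what this README states.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise fall back to "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch nothing can run on a GPU.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # Assumption: also try Apple Silicon GPUs where the backend exists.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


print("selected device:", pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build most likely does not match the local CUDA version.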

Screenshots

First Run

Setting

UML Diagram

Main Class Diagram

Vision Presenter / ColPaliVLMModel Diagram

ColPali Question/Answer Test

This is the system information used for this test.

OS : Windows 11
CPU : Ryzen 7 7800X3D
RAM : 64GB DDR5 Corsair 6000MT/s
GPU : Nvidia GeForce RTX 4070 Ti Super - 16GB VRAM
CUDA : 12.1

1) ColPali Efficient Document Retrieval with Vision Language Models Question/Answer

The document referenced in the question/answer below is the paper ColPali: Efficient Document Retrieval with Vision Language Models.

This document is 20 pages long and includes text, graphs, and images.

File indexing time : 17 seconds
Total pages : 20 pages
File size : 8.9 MB

English Questions

Summarize this document.
What is the purpose of the ViDoRe benchmark?
Why is the ColPali model superior to existing document retrieval systems?
What is the importance of visual cues in document retrieval systems?
How is the training dataset for the ColPali model composed?
How does the late interaction mechanism of the ColPali model work?
What evaluation metrics does the ViDoRe benchmark use?
What comparative models were used to evaluate the performance of the ColPali model?
How has the indexing speed of the ColPali model been improved?
What methods are used to reduce the memory usage of the ColPali model?

For the question "What evaluation metrics does the ViDoRe benchmark use?", note that the quality of the response differs depending on whether 5 images or 10 images are used.
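
The difference between 5 and 10 images presumably comes from how many of the top-ranked pages are handed to the answering model: with more pages, the page containing the relevant table or figure is more likely to be included. The sketch below illustrates that top-k selection step under this assumption. `top_k_pages` and the scores are hypothetical and are not taken from the application.

```python
import heapq


def top_k_pages(page_scores, k):
    """Return the k (page_number, score) pairs with the highest scores."""
    return heapq.nlargest(k, page_scores.items(), key=lambda item: item[1])


# Hypothetical retrieval scores per page number.
scores = {1: 0.42, 2: 0.91, 3: 0.13, 4: 0.77, 5: 0.58}

top2 = top_k_pages(scores, 2)
print(top2)  # [(2, 0.91), (4, 0.77)]
```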

Korean Questions

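이 문서를 요약해주세요. (Summarize this document.)
ViDoRe 벤치마크의 목적은 무엇인가요? (What is the purpose of the ViDoRe benchmark?)
ColPali 모델이 기존 문서 검색 시스템보다 우수한 이유는 무엇인가요? (Why is the ColPali model superior to existing document retrieval systems?)
문서 검색 시스템에서 시각적 단서의 중요성은 무엇인가요? (What is the importance of visual cues in document retrieval systems?)
ColPali 모델의 학습 데이터셋은 어떻게 구성되었나요? (How is the training dataset for the ColPali model composed?)
ColPali 모델의 늦은 상호작용 메커니즘은 어떻게 작동하나요? (How does the late interaction mechanism of the ColPali model work?)
ViDoRe 벤치마크는 어떤 평가 메트릭을 사용하나요? (What evaluation metrics does the ViDoRe benchmark use?)
ColPali 모델의 성능을 평가하기 위해 어떤 비교 모델이 사용되었나요? (What comparative models were used to evaluate the performance of the ColPali model?)
ColPali 모델의 인덱싱 속도는 어떻게 개선되었나요? (How has the indexing speed of the ColPali model been improved?)
ColPali 모델의 메모리 사용량을 줄이기 위한 방법은 무엇인가요? (What methods are used to reduce the memory usage of the ColPali model?)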

Q/A Result

Summarize this document.

What is the purpose of the ViDoRe benchmark?

Why is the ColPali model superior to existing document retrieval systems?

What is the importance of visual cues in document retrieval systems?

What evaluation metrics does the ViDoRe benchmark use?

Using 5 images
Using 10 images

What methods are used to reduce the memory usage of the ColPali model?

How has the indexing speed of the ColPali model been improved?

What comparative models were used to evaluate the performance of the ColPali model?

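ColPali 모델이 기존 문서 검색 시스템보다 우수한 이유는 무엇인가요? (Why is the ColPali model superior to existing document retrieval systems?)

문서 검색 시스템에서 시각적 단서의 중요성은 무엇인가요? (What is the importance of visual cues in document retrieval systems?)

ColPali 모델의 학습 데이터셋은 어떻게 구성되었나요? (How is the training dataset for the ColPali model composed?)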

2) Data and AI Trends Report 2024 Question/Answer

The document referenced in the question/answer below is the Data and AI Trends Report 2024.

This report is 44 pages long and includes text, graphs, and images.

File indexing time : 242 seconds
Total pages : 44 pages
File size : 23.7 MB
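
To compare the two test documents on equal footing, the indexing times reported in this README can be normalized to seconds per page. The snippet below does only that arithmetic, using the figures above (17 seconds for 20 pages, 242 seconds for 44 pages). It does not measure anything itself.

```python
# (indexing seconds, page count) as reported in this README.
runs = {
    "ColPali paper": (17, 20),
    "Data and AI Trends Report 2024": (242, 44),
}


def seconds_per_page(seconds, pages):
    """Average indexing time per page."""
    return seconds / pages


for name, (seconds, pages) in runs.items():
    print(f"{name}: {seconds_per_page(seconds, pages):.2f} s/page")
# ColPali paper: 0.85 s/page
# Data and AI Trends Report 2024: 5.50 s/page
```

Per page, the report took roughly 6.5 times longer to index than the paper on this test machine.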

English Questions

Explain the Top 5 trends.
What is RAG, and how can it be utilized?
Explain why we should learn AI.

Korean Questions

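우리가 AI를 배워야 하는 이유를 설명해줘. (Explain why we should learn AI.)
AI를 사용해서 데이터 통합을 하려고 하는 기업의 비율은 얼마나 될까? (What percentage of companies intend to use AI for data integration?)
RAG와 같은 AI 모델을 활용한 기술을 사용하여 데이터베이스 관리에 사용하고 싶은 기업의 비율은 얼마나 될까? (What percentage of companies want to use technologies based on AI models, such as RAG, for database management?)
RAG가 어떤 기술이고 어떻게 활용할 수 있어? (What kind of technology is RAG, and how can it be used?)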

Q/A Result

Explain the Top 5 trends.

What is RAG, and how can it be utilized?

Explain why we should learn AI.

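우리가 AI를 배워야 하는 이유를 설명해줘. (Explain why we should learn AI.)

AI를 사용해서 데이터 통합을 하려고 하는 기업의 비율은 얼마나 될까? (What percentage of companies intend to use AI for data integration?)
RAG와 같은 AI 모델을 활용한 기술을 사용하여 데이터베이스 관리에 사용하고 싶은 기업의 비율은 얼마나 될까? (What percentage of companies want to use technologies based on AI models, such as RAG, for database management?)

RAG가 어떤 기술이고 어떻게 활용할 수 있어? (What kind of technology is RAG, and how can it be used?)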

Question/Answer List

Important Notes

When a size is selected in the Image Size settings, the app resizes the images returned by ColPali so that the longer side (width or height) matches the selected size.
If the Image Size checkbox is selected, the app uses the images returned by ColPali as they are, without resizing them.
The larger the image size, the more tokens are used.
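
The "longer side" rule can be sketched as follows: compute one scale factor from the longer side and apply it to both dimensions so the aspect ratio is preserved. `fit_longer_side` is a hypothetical helper, not the app's own function, and whether the app also enlarges images smaller than the target is not stated here (this sketch does).

```python
def fit_longer_side(width, height, target):
    """Scale (width, height) so the longer side equals target, keeping aspect ratio."""
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height and target must be positive")
    scale = target / max(width, height)
    # Round to whole pixels and never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_longer_side(2000, 1000, 1000))  # (1000, 500)
print(fit_longer_side(1000, 2000, 512))   # (256, 512)
```

The resulting pair can be passed to an image library's resize function. Smaller targets reduce the pixel count sent to the answering model and therefore the token usage.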

License

Distributed under the MIT License.
