🥥 CocoIndex ETL with Document AI

CocoIndex is an ETL framework that transforms data for AI with real-time incremental processing: it keeps the index up to date with low latency when the source changes. It supports custom logic that snaps together like LEGO, making it easy to plug in the modules that best suit your project.

In this example, we walk you through building an embedding index from local files, using Google Document AI as the parser.

🥥 🌴 We are constantly improving, with more blogs and examples coming soon. Stay tuned 👀 and drop a star at CocoIndex on GitHub for the latest updates!

Prerequisite

Install Postgres if you don't have one.
Configure the Project ID and Processor ID for the Document AI API:
- Official Google Document AI API
- Sign in to Google Cloud Console, create or open a project, and enable the Document AI API.
- Create a processor in Document AI.
- Update '.env' with GOOGLE_CLOUD_PROJECT_ID and GOOGLE_CLOUD_PROCESSOR_ID.
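
Before running the pipeline, it can help to check that both settings are actually present. The following is a minimal sketch, not code from this repository: it assumes the values from '.env' have already been loaded into the process environment (for example by a dotenv loader), and the helper names `load_document_ai_config` and `processor_resource_name` are hypothetical. The default location `us` is also an assumption; use the region where your processor was created.

```python
import os
from typing import Dict, Mapping

# The two settings this README asks you to put in '.env'.
REQUIRED_KEYS = ("GOOGLE_CLOUD_PROJECT_ID", "GOOGLE_CLOUD_PROCESSOR_ID")


def load_document_ai_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or blank.

    Hypothetical helper for illustration; not part of CocoIndex.
    """
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return {key: env[key].strip() for key in REQUIRED_KEYS}


def processor_resource_name(config: Mapping[str, str], location: str = "us") -> str:
    """Build the Document AI processor resource name.

    The location default ("us") is an assumption; adjust it to your processor's region.
    """
    return "projects/{}/locations/{}/processors/{}".format(
        config["GOOGLE_CLOUD_PROJECT_ID"],
        location,
        config["GOOGLE_CLOUD_PROCESSOR_ID"],
    )
```

Calling `load_document_ai_config()` with no arguments reads from `os.environ`, so a missing or blank value fails early with a message naming the variable instead of surfacing later as an API error.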

Run

Install dependencies:

pip install -e .

Setup:

python main.py cocoindex setup

Update index:

python main.py cocoindex update

Run:

python main.py
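
If you prefer to run the setup, update, and run steps in one go, they can be chained from a small script. This is a sketch under stated assumptions rather than part of the repository: `build_commands` and `run_pipeline` are hypothetical names, the script is assumed to live next to main.py, and dependencies are assumed to be installed already.

```python
import subprocess
import sys

# The three commands from this README, in the order they are listed.
PIPELINE_STEPS = [
    ["main.py", "cocoindex", "setup"],
    ["main.py", "cocoindex", "update"],
    ["main.py"],
]


def build_commands(python=sys.executable):
    """Prefix each step with the interpreter that is running this script."""
    return [[python, *step] for step in PIPELINE_STEPS]


def run_pipeline(runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for command in build_commands():
        runner(command, check=True)
```

Calling `run_pipeline()` from the project directory executes the steps with the current interpreter. The `runner` parameter exists only so the sequence can be exercised without launching real processes.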

CocoInsight

CocoInsight is in Early Access now (Free) 😊 You found us! For a quick 3-minute video tutorial about CocoInsight, watch on YouTube.

Run CocoInsight to understand your RAG data pipeline:

python main.py cocoindex server -c https://cocoindex.io

Then open the CocoInsight UI at https://cocoindex.io/cocoinsight.

