A Unified Toolkit for Deep Learning Based Document Image Analysis
Updated Aug 15, 2024 - Python
PdfDet aims to simplify PDF layout detection tasks for users.
A lightweight Python library for metadata-rich document chunking in Retrieval-Augmented Generation (RAG) workflows. It leverages Azure AI Document Intelligence to enhance chunking by retaining hierarchical structure, page numbers, and bounding boxes for seamless integration with PDF viewers.