inkfil / OCR-In-Django-And-Tesseract


A Django-based web application that accepts image, PDF, and text files from users, extracts their textual content using OCR (Optical Character Recognition), and summarizes that content with the NLTK library using a PageRank-based algorithm and Natural Language Processing.
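The PageRank-based summarization mentioned above can be sketched as follows. This is a minimal illustration, not the repository's code: it uses only the standard library, and the regex sentence splitter stands in for NLTK's `sent_tokenize`, which the real application would presumably use. All function names (`split_sentences`, `bag_of_words`, `cosine`, `pagerank`, `summarize`) and the `max_sentences` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): a stand-in for nltk.sent_tokenize.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bag_of_words(sentence):
    # Lowercased word counts for one sentence.
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    # Power iteration over a weighted graph given as an adjacency matrix.
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sums[j] > 0:
                    rank += scores[j] * weights[j][i] / out_sums[j]
                else:
                    # Sentence with no similar neighbours: spread its score evenly.
                    rank += scores[j] / n
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(text, max_sentences=2):
    # Rank sentences by PageRank over a similarity graph and keep the
    # top ones in their original order.
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences
    bags = [bag_of_words(s) for s in sentences]
    n = len(sentences)
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(ocr_text, max_sentences=3)` on the text returned by the OCR step would yield up to three sentences. Sentences that share vocabulary with many others score higher, so an unrelated sentence is unlikely to be selected.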


Repository contents (3 commits):
ocr
readme.md
requirements.txt


OCRInDjangoAndTesseract

To Do list

Upload a file
Run the file through OCR [Pytesseract]
Pass the extracted text to the NLTK summarization function
Create an API for an easy interface
Set environment variables for ImageMagick and Tesseract
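For the last item, one possible way to wire an environment variable into pytesseract is sketched below. The variable name `TESSERACT_CMD` and both helper functions are assumptions made for illustration, not something the repository defines. The only pytesseract feature relied on is the `pytesseract.pytesseract.tesseract_cmd` attribute, which tells the library where the Tesseract binary lives.

```python
import os
import shutil


def resolve_tesseract_cmd(env=None):
    # Prefer the (hypothetical) TESSERACT_CMD variable, then fall back to
    # whatever "tesseract" binary is found on PATH. Returns None if neither exists.
    env = os.environ if env is None else env
    return env.get("TESSERACT_CMD") or shutil.which("tesseract")


def configure_pytesseract(env=None):
    # Point pytesseract at the resolved binary. The import is deferred so this
    # module can be loaded even where pytesseract is not installed.
    cmd = resolve_tesseract_cmd(env)
    if cmd is None:
        raise RuntimeError("Tesseract not found: set TESSERACT_CMD or add it to PATH")
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

A Django project could call `configure_pytesseract()` once at startup, for example from its settings module, so the OCR view does not need a hard-coded path. A similar lookup could cover ImageMagick, depending on which binding the project uses.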


Topics: django, ocr, django-framework, nltk, tesseract-ocr, pytesseract, nltk-python
