- Upload a file.
- Pass the file through OCR [Pytesseract].
- Pass the file through the NLTK summarization function.
- Create an API for an easy interface.
- Set environment variables for ImageMagick and Tesseract.
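The environment-variable step above can be sketched as a small lookup helper. This is only an illustration using the standard library: the variable names `TESSERACT_CMD` and `IMAGEMAGICK_BINARY` and the function `resolve_tool` are hypothetical and may not match what the project actually reads.

```python
import os
import shutil
from typing import Mapping, Optional

# Hypothetical variable names; the project may use different ones.
TESSERACT_ENV = "TESSERACT_CMD"
IMAGEMAGICK_ENV = "IMAGEMAGICK_BINARY"


def resolve_tool(env_var: str, default_name: str,
                 environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the path of an external tool, or None if it cannot be found.

    The environment variable wins when it is set and non-blank; otherwise
    the executable name is searched for on PATH.
    """
    env = os.environ if environ is None else environ
    configured = env.get(env_var, "").strip()
    if configured:
        return configured
    return shutil.which(default_name)


if __name__ == "__main__":
    print("tesseract:", resolve_tool(TESSERACT_ENV, "tesseract"))
    print("imagemagick:", resolve_tool(IMAGEMAGICK_ENV, "magick"))
```

With Pytesseract, a path resolved this way could then be assigned to `pytesseract.pytesseract.tesseract_cmd` before any OCR call is made.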
A Django-based web application that accepts image, PDF, and text files from users, extracts their textual content using OCR (Optical Character Recognition), and summarizes that content with the NLTK library using the PageRank algorithm and natural language processing.
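To make the PageRank-based summarization idea concrete, here is a minimal sketch of sentence ranking in that style. It is not the project's code: it uses only the standard library, swaps NLTK's tokenizers for naive regex splitting, and scores sentence pairs by Jaccard word overlap, which is an assumption chosen for brevity. All function names are hypothetical.

```python
import re
from typing import List, Set


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(sentence: str) -> Set[str]:
    # Lowercased alphanumeric tokens, as a set.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a: Set[str], b: Set[str]) -> float:
    # Jaccard overlap between two word sets (assumed similarity measure).
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pagerank(weights: List[List[float]], damping: float = 0.85,
             iterations: int = 50) -> List[float]:
    # Power iteration on a weighted graph; weights[j][i] is the edge j -> i.
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    rank += weights[j][i] / out_sum[j] * scores[j]
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(text: str, max_sentences: int = 2) -> List[str]:
    # Keep the highest-ranked sentences, returned in their original order.
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences
    bags = [words(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0 if i == j else similarity(bags[i], bags[j])
                for j in range(n)] for i in range(n)]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no words with the rest of the text receives only the baseline score, so it is the first to be dropped from the summary. In the real application the input text would come from the OCR step rather than a literal string.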
inkfil/OCR-In-Django-And-Tesseract