handwritten recognition #1472

Closed
wants to merge 2 commits into from


MufeezQadri-main

Add handwriting recognition support using cloud services

- Implement new handwriting recognition plugin using Google Cloud Vision and Azure Form Recognizer
- Add support for multiple languages in handwriting recognition
- Implement hOCR output generation for recognized handwritten text
- Add PDF generation capability for handwritten documents
- Include test files and samples for handwriting recognition

This feature allows OCRmyPDF to recognize handwritten text in PDF documents using cloud-based OCR services, expanding its capabilities beyond printed text recognition.

@jbarlow83
Collaborator

I appreciate the proposed contribution, but I don't think this is quite production ready, since there are obviously still some TODOs (no hOCR implementation for one use case, etc.) and a lack of documentation or tests. It looks to me like some important cases may not be handled, e.g. rotated baselines, which are important for PDF viewers to understand and group related text.

I also think it would be more accurate to simply name it cloud OCR, since handwriting recognition is just one feature (and tradeoff) of cloud OCR.

My suggestion is to create this as a third-party plugin in its own repository, and then when it seems more mature we can discuss merging.
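
To illustrate the rotated-baseline case mentioned above, here is a minimal sketch of how a line's hOCR `title` properties could be derived from a quadrilateral returned by a cloud OCR service. It is not code from this PR. The function name `hocr_line_title` is hypothetical, and the vertex order (top-left, top-right, bottom-right, bottom-left, relative to the text direction) is an assumption about the service's output. The bottom edge of the quadrilateral is used as an approximation of the baseline, expressed as `baseline slope offset` relative to the bottom-left corner of the bounding box, in image coordinates where y grows downward.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hocr_line_title(quad: Sequence[Point]) -> str:
    """Build an hOCR title string (bbox plus linear baseline) for one text line.

    Assumption: quad is [top_left, top_right, bottom_right, bottom_left]
    in image pixels, ordered relative to the reading direction.
    """
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    left, top, right, bottom = min(xs), min(ys), max(xs), max(ys)

    # Approximate the baseline by the bottom edge of the quadrilateral.
    x1, y1 = quad[3]  # bottom-left vertex
    x2, y2 = quad[2]  # bottom-right vertex

    title = f"bbox {round(left)} {round(top)} {round(right)} {round(bottom)}"
    if x1 != x2:
        # Line y = slope * x + offset, with the origin moved to the
        # bottom-left corner of the axis-aligned bounding box.
        slope = (y2 - y1) / (x2 - x1)
        offset = (y1 - bottom) - slope * (x1 - left)
        title += f"; baseline {round(slope, 3)} {round(offset, 3)}"
    # A vertical edge (x1 == x2) cannot be expressed as a slope, so the
    # baseline is omitted here; such lines would need separate handling,
    # for example through the hOCR textangle property.
    return title


# A slightly rotated line: the text rises from left to right.
rotated = [(10, 30), (110, 20), (112, 40), (12, 50)]
print(hocr_line_title(rotated))
```

Emitting only the axis-aligned `bbox` for a line like this would discard the slope, which is the information a PDF renderer needs to place the invisible text layer along the handwriting.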

@jbarlow83 closed this on Feb 7, 2025