Monitor #27

Merged
zdenop merged 5 commits into master from monitor on Jan 5, 2016
jimregan
Contributor

Zdenko writes:

I found an older patch [1] that comes from an interesting Android app, Text Fairy (OCR Text Scanner) [2], [3]. I put it on a separate branch (monitor) and split the original patch into 3 commits for testing/cherry-picking:
1. adds monitor/ETEXT_DESC to GetHOCRText
2. extends ETEXT_DESC with a PROGRESS_FUNC field and changes the percentage progress values to start at 0% instead of 30%
3. extends the hOCR output with a row attribute. I skipped the part where the patch hard-codes the font (size?) to 15...

[1] https://www.mail-archive.com/tesseract-ocr@googlegroups.com/msg08089.html
[2] https://github.com/renard314/textfairy
[3] https://play.google.com/store/apps/details?id=com.renard.ocr
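
To illustrate the idea behind the first two commits, here is a minimal, self-contained sketch of a monitor that carries a progress callback and a cancel callback and reports word-recognition progress starting at 0%. It does not use Tesseract's headers: `MonitorSketch`, `RecognizeWordsSketch`, and every field name are hypothetical stand-ins, not the actual `ETEXT_DESC` or `PROGRESS_FUNC` definitions from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a monitor struct. This is NOT Tesseract's
// ETEXT_DESC; the field names are assumptions made for illustration only.
struct MonitorSketch {
  int progress = 0;  // last reported percentage, 0..100
  void (*on_progress)(int percent, void* user_data) = nullptr;
  bool (*should_cancel)(void* user_data, int words_done) = nullptr;
  void* user_data = nullptr;
};

// Pretends to recognize each word in turn. Progress is reported before each
// word, starting at 0%, and 100% is reported once every word is processed.
// Returns the number of words processed before completion or cancellation.
int RecognizeWordsSketch(const std::vector<std::string>& words,
                         MonitorSketch* monitor) {
  const int total = static_cast<int>(words.size());
  int done = 0;
  for (; done < total; ++done) {
    if (monitor != nullptr) {
      if (monitor->should_cancel != nullptr &&
          monitor->should_cancel(monitor->user_data, done)) {
        break;  // the caller asked to stop recognition
      }
      monitor->progress = 100 * done / total;
      if (monitor->on_progress != nullptr) {
        monitor->on_progress(monitor->progress, monitor->user_data);
      }
    }
    // Real word recognition would happen here.
  }
  if (monitor != nullptr && done == total) {
    monitor->progress = 100;
    if (monitor->on_progress != nullptr) {
      monitor->on_progress(100, monitor->user_data);
    }
  }
  return done;
}
```

A caller would fill in the callbacks it cares about and pass the monitor to the recognition call; leaving the monitor null skips all reporting.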

@jimregan
Contributor Author

n-way history is in #26

@@ -587,6 +587,16 @@ class TESS_API TessBaseAPI {
* Make a HTML-formatted string with hOCR markup from the internal
* data structures.
* page_number is 0-based but will appear in the output as 1-based.
* monitor can be used to
* cancel the regocnition
Contributor Author

Typo: 'recognition'

@theraysmith
Contributor

I like the spirit of the changes, except that I think it would be better to add some progress code to layout analysis than to start the progress at 0 for word recognition.
The time taken by layout analysis is heavily dependent on the PageSegMode, so it would be better if it (Tesseract::SegmentPage/Tesseract::AutoPageSeg/ColumnFinder::FindBlocks) could call the callbacks and return a number that becomes the base percentage to the recognition phase instead of using either an arbitrary 30, 0 or even worse, making it depend on the existence of a callback function.
That is a bigger change of course.
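
A rough sketch of the arithmetic behind this suggestion, assuming layout analysis hands back a base percentage and word recognition fills the remaining range. `ScaleRecognitionProgress` and its parameters are hypothetical names, not functions that exist in Tesseract.

```cpp
#include <cassert>

// Hypothetical helper: maps word-level progress into [base_percent, 100],
// where base_percent is whatever layout analysis reported on completion.
int ScaleRecognitionProgress(int base_percent, int words_done,
                             int words_total) {
  assert(base_percent >= 0 && base_percent <= 100);
  if (words_total <= 0) return 100;  // nothing to recognize
  return base_percent + (100 - base_percent) * words_done / words_total;
}
```

With a base of 30, half of the words done maps to 65%; with a base of 0, the same point maps to 50%, so the base no longer needs to be a fixed constant.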

@zdenop
Contributor

zdenop commented May 19, 2015

If there is no room to implement a progress monitor for layout analysis in the 3.04 release, I would suggest merging this change with an explanation that it is only a progress monitor for word recognition and does not cover layout analysis...

zdenop added a commit that referenced this pull request Jan 5, 2016
@zdenop zdenop merged commit c53add7 into master Jan 5, 2016
@zdenop zdenop deleted the monitor branch January 5, 2016 15:29
zvezdochiot pushed a commit to ImageProcessing-ElectronicPublications/tesseract that referenced this pull request Mar 28, 2021