TessBaseAPIProcessPages only processes information for the last page #1138
Comments
Please report this issue in the repo of the Python binding.
...and the latest tesseract stable version is 3.05.01
@zdenop This happens in 3.05.01 as well. I'm wondering whether calling @amitdo I did not see a Python binding repo under tesseract. Can you please link me to the one you are referring to? Please note I'm using
I thought that you were using a 3rd-party Python binding.
The command line uses the C++ API.
@amitdo Sorry for the confusion, but I am not. I am referring to the actual tesseract C API (capi). The capi can be called from Python as shown in this example: tesseract/contrib/tesseract-c_api-demo.py, line 72 in a75ab45
However, the question remains of how to call the capi to get the text from every page of a multi-page TIFF instead of the text from only the last page.
We do not provide support for third-party software (e.g. Python), so you need to be able to replicate the problem with C++ or C.
This method returns text for one page only. The command line tool does not directly call this method. You'll have to look in api/tesseractmain.cpp and mimic it to get things right. Anyway, it's not an issue (bug) in the tesseract command line or API.
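Since the text comes back one page at a time, one workaround is to drive the page loop yourself and collect each page's text as you go. The sketch below only shows that loop. `read_page` and `recognize_page` are hypothetical callables, not part of tesseract: they stand in for whatever you use to load page N of the TIFF and to run recognition on a single page image. This is not what tesseractmain.cpp does, just a minimal illustration of the accumulate-per-page idea.

```python
from typing import Callable, List, Optional, TypeVar

Page = TypeVar("Page")


def ocr_all_pages(
    read_page: Callable[[int], Optional[Page]],
    recognize_page: Callable[[Page], str],
    separator: str = "\f",
) -> str:
    """Recognize pages 0, 1, 2, ... and join their text.

    read_page(index) is a hypothetical loader that returns None once the
    document has no page at that index. recognize_page(page) is a
    hypothetical wrapper that returns the text of a single page.
    """
    texts: List[str] = []
    index = 0
    while True:
        page = read_page(index)
        if page is None:  # ran past the last page
            break
        # Keep each page's text instead of overwriting the previous result.
        texts.append(recognize_page(page))
        index += 1
    return separator.join(texts)
```

With stand-in callables, `ocr_all_pages(lambda i: pages[i] if i < len(pages) else None, str.upper, "\n")` joins the per-page results in page order, so no page's text is lost when the next page is processed.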
Thanks. I've posted this question on the user forum here: https://groups.google.com/forum/#!topic/tesseract-ocr/AL9LzrHa97k I will continue to dig into
There is no direct way to get the text of all pages with ProcessPages. If you give it a pointer to a TessResultRenderer, the text is written to a file or stdout.
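For anyone calling the capi from Python, here is a rough ctypes sketch of the renderer route. It assumes the 3.04+/3.05 C API, where TessBaseAPIProcessPages takes a TessResultRenderer* as its last argument and returns a BOOL, and where TessTextRendererCreate takes an output base name. The helper names `load_library` and `process_pages_to_text_file`, and the library name "tesseract", are illustrative assumptions; adjust them for your platform. It has not been checked against every tesseract version.

```python
import ctypes
import ctypes.util


def load_library(name):
    """Find and load a shared library by name, or raise OSError."""
    path = ctypes.util.find_library(name)
    if path is None:
        raise OSError("shared library %r not found" % name)
    return ctypes.CDLL(path)


def process_pages_to_text_file(tiff_path, output_base, lang="eng", datapath=None):
    """Run ProcessPages with a text renderer so every page's text is written out.

    Assumption: the text renderer writes to output_base plus a ".txt" suffix.
    Returns True if TessBaseAPIProcessPages reported success.
    """
    tess = load_library("tesseract")  # library name is an assumption
    vp, cp = ctypes.c_void_p, ctypes.c_char_p

    # Declare the C signatures so pointers are not truncated on 64-bit systems.
    tess.TessBaseAPICreate.restype = vp
    tess.TessBaseAPIInit3.argtypes = [vp, cp, cp]
    tess.TessBaseAPIInit3.restype = ctypes.c_int
    tess.TessTextRendererCreate.argtypes = [cp]
    tess.TessTextRendererCreate.restype = vp
    tess.TessBaseAPIProcessPages.argtypes = [vp, cp, cp, ctypes.c_int, vp]
    tess.TessBaseAPIProcessPages.restype = ctypes.c_int
    for name in ("TessDeleteResultRenderer", "TessBaseAPIEnd", "TessBaseAPIDelete"):
        func = getattr(tess, name)
        func.argtypes = [vp]
        func.restype = None

    api = tess.TessBaseAPICreate()
    renderer = None
    try:
        data = datapath.encode() if datapath is not None else None
        if tess.TessBaseAPIInit3(api, data, lang.encode()) != 0:
            raise RuntimeError("could not initialize tesseract")
        renderer = tess.TessTextRendererCreate(output_base.encode())
        # No retry config, no timeout; the renderer receives every page.
        ok = tess.TessBaseAPIProcessPages(api, tiff_path.encode(), None, 0, renderer)
        return bool(ok)
    finally:
        if renderer:
            tess.TessDeleteResultRenderer(renderer)  # deleting the renderer closes its output
        tess.TessBaseAPIEnd(api)
        tess.TessBaseAPIDelete(api)
```

After a successful call you would read the text back from the output file rather than from TessBaseAPIGetUTF8Text, which only reflects the most recently processed page.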
Environment
Current Behavior:
When processing a multi page TIFF with Tesseract API in python, the text returned is only for the LAST page rather than for ALL pages.
Code used:
Expected Behavior:
Instead of text returned only for the last page, the text should be returned for all pages.
Suggested Fix:
Interestingly, this works when using the command-line tesseract command. So perhaps there is already a fix for the command line.