Image too small to scale!! (2x48 vs min width of 3) - vertical lstm training #3001
Comments
Usually you can simply ignore that message. It is typically caused by very narrow glyphs (like |
Duplicate of #2890. |
@stweil thanks, I do have a lot of small glyphs, but I don't see any output other than these errors. Is the setting that controls the minimum glyph size part of the network specification? |
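To judge how many training samples trigger the message and how narrow they are, the reported sizes can be tallied from a saved training log. This is a minimal sketch, assuming the log lines follow the format quoted in the issue title; the function name `tally_small_images` and the sample log are made up for illustration and are not part of Tesseract.

```python
import re
from collections import Counter

# Pattern for the message quoted in the issue title, e.g.
# "Image too small to scale!! (2x48 vs min width of 3)"
PATTERN = re.compile(
    r"Image too small to scale!! \((\d+)x(\d+) vs min width of (\d+)\)"
)


def tally_small_images(log_text):
    """Count how often each (width, height, min_width) triple is reported."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        width, height, min_width = (int(g) for g in match.groups())
        counts[(width, height, min_width)] += 1
    return counts


# Hypothetical log excerpt, invented for this example.
sample_log = """\
Image too small to scale!! (2x48 vs min width of 3)
(other training output)
Image too small to scale!! (2x48 vs min width of 3)
Image too small to scale!! (1x48 vs min width of 3)
"""

for (width, height, min_width), n in sorted(tally_small_images(sample_log).items()):
    print(f"{width}x{height} (min width {min_width}): {n} occurrence(s)")
```

If only a handful of very narrow sizes dominate the tally, that is consistent with the narrow-glyph explanation; if every sample is reported, something else is likely wrong.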
I'm having the same issue as @davidb1 |
I have the same error but when I'm actually trying to do OCR. Tesseract v4.1.1-rc2-25-g9707 and error is |
I investigated this a bit today, since there were reports of successful training of jpn_vert in the forum in 2018, which suggested that something had broken after that. I downloaded the traineddata files from the project https://github.com/zodiac3539/jpn_vert and compared training on the same set of inputs with tesseract-4.0.0-beta3 and tesseract-4.1.0-rc1. The errors do not appear with 4.0.0-beta3 but do appear with 4.1.0-rc1, so they were introduced somewhere between the two.
4.0.0-beta3 log
4.1.0-rc1 log
To those who want to train vertical fonts, please try tesseract-4.0.0-beta3. It would be great if someone could investigate this further to find out when the bug was introduced. |
The original report was for 4.0.0. So can we reduce the commit range which introduced the changed behaviour to somewhere between 4.0.0-beta3 and 4.0.0? Which data and commands did you use to get the log files shown above? |
4.0.0-beta.4 is ok (Jul 30 2018)
4.0.0-rc1 fails (Oct 2, 2018)
2018-09 - commit # 554450c fails (Sep 3, 2018)
So it would be some commit from August 2018. |
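The narrowing above is a manual bisection between a known-good and a known-bad version, which `git bisect` automates. As a rough illustration of the idea only, here is a minimal sketch; the intermediate commit names `c1` to `c4` and the set of failing commits are hypothetical placeholders, not real Tesseract history.

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return hi


# Hypothetical history between the good and bad points named in the thread.
commits = ["4.0.0-beta.4", "c1", "c2", "c3", "c4", "554450c"]
failing = {"c3", "c4", "554450c"}  # made-up test results for illustration

index = first_bad(commits, lambda commit: commit in failing)
print(commits[index])
```

In practice `is_bad` would build each commit and run the same training on the same inputs, checking whether the error appears in the log.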
I am using the 07YasashisaAntique font provided by a user, with tesstrain.sh rendering text in vertical mode. The font needs to be added to the vertical fonts in language_specific.sh.
|
Please see https://groups.google.com/g/tesseract-ocr/c/QVTAfLGKiNI/m/QfQxv924CgAJ for the report in forum along with a zipped file of training fonts etc. |
@stweil This bug seems to have been introduced by my commits related to jav_java. I will revert the change, test and report back later today. |
The problem was caused by addition of This led to a change in size of the lstmf file (and what was in it). |
Environment
Tesseract Version: 4.00
Commit Number:
Platform: Ubuntu 18.04
Current Behavior:
I am generating vertical lstm training files using tesstrain.sh but when I try to train on them I get (on all the training data):
I couldn't find much on the problem except #590 and I couldn't find a solution there.
Where should I be looking in order to fix this?
Expected Behavior:
Suggested Fix: