LSTM: Training - Image not trainable #590
Comments
The images were created by text2image from training text whose word-wrapped lines ran the full width of the page. Is there a limit on the size of images for training? Should training text be only 70-120 characters wide? |
This is the opposite of the image being too small.
|
https://github.com/tesseract-ocr/tesseract/blob/ce76d1c569/lstm/lstmrecognizer.cpp#L266
|
Then shouldn't text2image ensure that images are made to fit that width?
ShreeDevi
// Maximum width of image to train on.
const int kMaxImageWidth = 2560;
|
Yes :-) |
https://github.com/tesseract-ocr/tesseract/blob/831e161066d28a0320d7061c8403f638515b8801/training/text2image.cpp#L82
// Width of output image (in pixels).
INT_PARAM_FLAG(xsize, 3600, "Width of output image");
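One way to keep generated pages narrower than the 3600-pixel default is to pass a smaller `--xsize` when calling text2image directly. The following is a minimal sketch that only builds the command line and does not run it. The file names and the font are placeholders, and 2500 is just an example value below the 2560 limit quoted above; whether that is narrow enough for a given font size has not been verified here.

```python
import shlex


def text2image_command(text_path, output_base, font, xsize=2500):
    """Build a text2image command line with a page narrower than the default."""
    return [
        "text2image",
        f"--text={text_path}",          # placeholder training text file
        f"--outputbase={output_base}",  # placeholder output base name
        f"--font={font}",               # placeholder font name
        f"--xsize={xsize}",             # page width in pixels (default is 3600)
    ]


cmd = text2image_command("eng.training_text", "eng.Arial.exp0", "Arial")
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.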
|
The default width of images output by text2image can be reduced when running tesstrain.sh by modifying tesstrain_utils.sh.
|
Ray, regarding "// Maximum width of image to train on.": I have some old tif/box pairs whose image width is 4000. Will training quality be degraded if I change the above constant to 4000 in order to use them? |
Also, can this be changed at runtime with a variable, or do I need to recompile tesseract with the higher value? |
Changing tesstrain_utils.sh for
|
@Shreeshrii, how can the problem of the image being too small be fixed? |
Usually this happens for just a few lines of an image, since tesseract splits the input image into a separate image per line. It can happen when layout analysis has wrongly segmented the page or when a line has been detected as having hundreds of diacritics. If it is just a few messages, you can ignore them. @theraysmith Any update regarding the new line detection algorithm? |
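To judge whether it really is "just a few messages", the training output can be tallied. This is a rough sketch, assuming the training log has been captured as text; the message strings are the ones quoted in this thread, and the function name is made up for illustration.

```python
from collections import Counter

# Message fragments taken from the errors quoted in this thread.
KNOWN_MESSAGES = (
    "Image too small to scale",
    "Image too large to learn",
    "Image not trainable",
    "Compute CTC targets failed",
)


def count_messages(log_text):
    """Count how often each known message appears in a training log."""
    counts = Counter()
    for line in log_text.splitlines():
        for message in KNOWN_MESSAGES:
            if message in line:
                counts[message] += 1
    return counts


sample_log = (
    "Image too small to scale!! (3x48 vs min width of 3)\n"
    "Line cannot be recognized!!\n"
    "Image not trainable\n"
    "Compute CTC targets failed!\n"
)
counts = count_messages(sample_log)
for message, n in counts.items():
    print(f"{n:6d}  {message}")
```

If the counts are of the same order as the number of training lines, the messages are not noise and the input images need to be checked.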
Actually, it's not just a few messages. I am trying to train tesseract to
recognize license plates, and the prepared training_text looks just like a
license plate, something like this:
۵۴ ۷۲۸ ب ۱۴
Each line contains one of these patterns.
I received a lot of these errors and the training process finished with
an error rate equal to zero. No training!
Would you please help me figure out what the problem is?
|
@hanikh, please paste a short example for the errors you get. |
The exact error message would greatly help diagnose the problem.
On Tue, Aug 8, 2017 at 10:28 PM, Amit D. wrote:
> Image too large to learn!! Size = 2758x48
> Image not trainable
--
Ray.
|
I will send the exact error message as soon as possible. Meanwhile, I
have run into a more important problem. I fine-tuned tesseract for Farsi (40
fonts on 6000 text lines) and got worse results than the original tesseract
on the trained fonts. What is the problem? Is the training_text not big
enough? (This is a different project, not related to the license plates.)
|
@hanikh |
@theraysmith, would you please help me: how many text lines are appropriate? |
> I finetuned tesseract for farsi (40 fonts on 6000 text lines)
I think this may be too much for fine-tuning.
I noticed that tesstrain.sh limits text2image-generated images to
just 3 pages, which would be at most 150 lines per font.
With that much input, you could try replace-a-layer training to see if that
gets you better results.
ShreeDevi
|
@hanikh I suggest waiting until Ray updates the langdata and also uploads
the new version of unichar_extractor. Before that, training for RTL
languages may not give useful results.
ShreeDevi
|
Image too small to scale!! (3x48 vs min width of 3) Finished! Error rate = 0 |
Initial problem: (Image too small to scale)
Those images are ridiculously small at 3x48 pixels. Something is going
wrong somewhere with the images.
Are they oriented vertically? The input scaling scales the height to 48,
whatever it starts as, so it looks like your textlines are vertical.
Fine tuning problem:
The problem is most likely too many iterations. It will hone its accuracy
to whatever training data you give it if you run it for too many iterations.
See how few iterations are used in the training tutorial for fine tuning.
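The scaling described for the initial problem can be checked before training. The sketch below is only an approximation under stated assumptions: the target height of 48 and the minimum width of 3 come from the messages in this thread, the maximum width of 2560 is the kMaxImageWidth constant quoted earlier, and the rounding and the exact comparisons used inside Tesseract are guesses. The function names are made up for illustration.

```python
MAX_IMAGE_WIDTH = 2560  # kMaxImageWidth, as quoted earlier in this thread
TARGET_HEIGHT = 48      # input height after scaling, per the explanation above
MIN_WIDTH = 3           # from "min width of 3" in the error message


def scaled_width(width, height, target_height=TARGET_HEIGHT):
    """Approximate width of a line image once its height is scaled to target_height."""
    return max(1, round(width * target_height / height))


def classify_line(width, height):
    """Guess whether a line image of the given size would be accepted for training."""
    w = scaled_width(width, height)
    if w <= MIN_WIDTH:  # assumption: the logged 3x48 case was rejected at width 3
        return "too small"
    if w > MAX_IMAGE_WIDTH:
        return "too large"
    return "ok"


# A text line stored vertically: 48 pixels wide and 768 pixels tall.
print(classify_line(48, 768))
# The size reported in "Image too large to learn!! Size = 2758x48".
print(classify_line(2758, 48))
```

A tall, narrow crop collapses to only a few pixels of width once its height is forced to 48, which is consistent with the suspicion above that the text lines are oriented vertically.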
On Sat, Aug 12, 2017 at 5:19 AM, hanikh wrote:
> Image too small to scale!! (3x48 vs min width of 3)
> Line cannot be recognized!!
> Image not trainable
> Compute CTC targets failed!
> [the same four messages repeat, in varying order, for the rest of the log]
> 2 Percent improvement time=0, best error was 2.167 @ 14
> At iteration 14/1100/20884, Mean rms=0.049%, delta=0%, char train=0%, word train=0%, skip ratio=1798.6%, New best char error = 0 wrote best model:/home/fanasa/tesstutorial/fastuned_from_fas/fastuned-plates0_14.lstm wrote checkpoint.
> Finished! Error rate = 0
> This is the error I got during training for license plates.
--
Ray.
|
Ray,
I have seen "line too small to be recognized" when building box/tiff pairs
using tesstrain.sh. It usually appears together with "nnn diacritics found", so
it may be related to accents being treated as a separate line.
Regarding fine-tuning, I have experimented a lot with Devanagari. With a
smaller number of iterations, the reported error rate is higher, and it
takes tens of thousands of iterations to get more accuracy on the
training set. I am not sure of the effect on samples it has not seen. See
https://github.com/Shreeshrii/tess4training/blob/master/README.md
ShreeDevi
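One way to see the effect on samples the model has not seen is to hold back part of the training lines and evaluate on them separately. This is a generic sketch, not a Tesseract API: the item names are placeholders, and the 10% fraction and the fixed seed are arbitrary choices.

```python
import random


def split_lines(items, eval_fraction=0.1, seed=0):
    """Shuffle items reproducibly and split them into (train, held_out)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_eval = max(1, int(len(items) * eval_fraction))
    return items[n_eval:], items[:n_eval]


# Placeholder names standing in for per-line training files.
all_lines = [f"line_{i:04d}" for i in range(100)]
train, held_out = split_lines(all_lines)
print(len(train), len(held_out))
```

If the error on the training part keeps falling while the error on the held-out part rises, the extra iterations are fitting the training data rather than improving recognition.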
|
For the fine-tuning problem:
the error rate reaches 0.017 at about 80000 iterations, so with as few
iterations as in the tutorial, a low error rate like 0.01 cannot be
achieved. So do you think fine-tuning is the wrong solution and I should try
replacing some layers? As I said before, I am trying to train for 40 Persian
fonts, and they are very common.
> > Compute CTC targets failed!
> > Compute CTC targets failed!
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Compute CTC targets failed!
> > Compute CTC targets failed!
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Compute CTC targets failed!
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > Image too small to scale!! (3x48 vs min width of 3)
> > Line cannot be recognized!!
> > Image not trainable
> > 2 Percent improvement time=0, best error was 2.167 @ 14
> > At iteration 14/1100/20884, Mean rms=0.049%, delta=0%, char train=0%, word train=0%, skip ratio=1798.6%, New best char error = 0 wrote best model:/home/fanasa/tesstutorial/fastuned_from_fas/fastuned-plates0_14.lstm wrote checkpoint.
> >
> > Finished! Error rate = 0
> > this is the error I got during training for licence plates.
> >
>
>
>
> --
> Ray.
>
|
@Shreeshrii would you please explain the new traineddata file? Where
can the lang.lstm-unicharset file be found? How can combine_lang_model be
used? Thanks.
On Mon, Aug 14, 2017 at 11:44 AM, Hanieh Khosravi <hani.khosravi@gmail.com>
wrote:
… For the fine-tuning problem:
the error rate reaches 0.017 at about 80000 iterations, so with as few
iterations as in the tutorial, a low error rate like 0.01 cannot be
achieved. So do you think fine tuning is the wrong solution and I should try
replacing some layers? As I said before, I am trying to train for 40 Persian
fonts, and they are very common.
On Sun, Aug 13, 2017 at 9:38 AM, Shreeshrii ***@***.***>
wrote:
> Ray,
>
> I have seen line too small to be recognized when building box/tiff pairs
> using tesstrain.sh - it is usually related to 'nnn diacritics found' - so
> it may be related to accents being treated as a separate line.
>
> Regarding finetuning, I have experimented a lot with Devanagari. With a
> smaller number of iterations, the reported error rate is higher, and it
> takes tens of thousands of iterations to get more accuracy on the
> training set; I am not sure of its effect on samples it has not seen. See
> https://github.com/Shreeshrii/tess4training/blob/master/README.md
>
>
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sun, Aug 13, 2017 at 9:44 AM, theraysmith ***@***.***>
> wrote:
>
>
> > Initial problem: (Image too small to scale)
> > Those images are ridiculously small at 3x48 pixels. Something is going
> > wrong somewhere with the images.
> > Are they oriented vertically? The input scaling scales the height to 48,
> > whatever it starts as, so it looks like your textlines are vertical.
> >
> > Fine tuning problem:
> > The problem is most likely too many iterations. It will hone its accuracy
> > to whatever training data you give it if you run it for too many
> > iterations.
> > See how few iterations are used in the training tutorial for fine tuning.
> >
> > --
> > Ray.
> >
|
It will create lang.* files, including the unicharset. You can use dawg2wordlist to see the wordlist used.
For RTL languages, there is an additional flag. Please see https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain_utils.sh for details. I used a hand-edited unicharset, because the unicharset generated from the current training process is |
|
Fixes Image too large to learn!! Size = 2594x48 Image not trainable See tesseract-ocr#590 (comment) for related discussion
#590 (comment) by Ray Smith
This bug is still there. |
with version
|
I had the same problem as the thread OP:
I resolved it with this suggestion above #590 (comment)
Was this the correct approach? |
Image too large to learn!! hasn’t gone. You get it with a small enough font or with 48-pixel-tall input layer even using
So the question is why does this constraint exist and whether it can be dropped or set to, say, 6000? Or should one prepare shorter lines after all? What would be the correct solution? |
Similar to #590 (comment)
tesseract 4.1.0-rc1-255-g332a1 |
Hi Shree, I am also getting the same error |
Does anyone know the recommended size for the images from which bounding boxes are extracted to retrain Tesseract? If so, should I retrain with a fixed size or with a variety of image sizes? |
I'm also experiencing this same error while trying to fine tune an existing model:
As suggested in #590, I modified tesstrain_utils.sh by changing the X_SIZE variable, but it didn't help. Also, after sorting the .tif files by dimension in descending order, I noticed that the first three files aren't even that large: In fact, none of my images have a width of 3316 px. Why is tesseract getting these different values for dimensions? |
Please upload a sample lstmf file which is getting the error for checking.
…On Sat, Mar 7, 2020, 01:01 Luan Utimura ***@***.***> wrote:
@Shreeshrii <https://github.com/Shreeshrii>
I'm also experiencing this same error while trying to fine tune an
existing model:
[...]
Loaded 1/1 lines (1-1) of document data/bar-ground-truth/test-0-049.exp0.lstmf
Image too large to learn!! Size = 3316x48
Image not trainable
Loaded 1/1 lines (1-1) of document data/bar-ground-truth/test-1-026.exp0.lstmf
Image too large to learn!! Size = 3316x48
[...]
As suggested in #590
<#590 (comment)>,
I modified tesstrain_utils.sh by changing the X_SIZE variable but it
didn't help.
Also, after sorting the .tif files by dimension in descending order, I
noticed that the first three files aren't even that large:
[image: files]
<https://user-images.githubusercontent.com/10110243/76115228-85bada00-5fc6-11ea-910a-10a2857f90b5.png>
In fact, none of my images have a width of 3316 px.
I tried to resize them w/ ImageMagick but it didn't help as well.
Why is tesseract reading these values for dimensions?
|
This one, for example: Thanks for replying. |
@lnutimura Thanks for the lstmf file. I unpacked it using an experimental feature by @stweil to check the image file in it. You are right, the image size is 2351x32. I think the image is being resized to a height of 48 as part of training, and that is increasing its width to 3627 and leading to the error. I had thought that the resized image might be kept in the lstmf file, but that is not the case.
Please take a look at the network spec that you are using for training. Usually the image height is either 36 or 48 in them, e.g. from https://tesseract-ocr.github.io/tessdoc/Data-Files-in-tessdata_fast
Version string:4.00.00alpha:amh:synth20170629
Version string:4.00.00alpha:Arabic:synth20170629:[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx96Lrx96Lfx128O1c1] |
From https://tesseract-ocr.github.io/tessdoc/VGSLSpecs.html
|
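The input height discussed above can be read off the start of a network spec. Here is a minimal Python sketch, assuming the spec begins with the four input numbers in the order batch, height, width, depth, as described on the VGSL specs page. The helper name `vgsl_input_shape` is made up for illustration and is not part of Tesseract.

```python
import re


def vgsl_input_shape(spec):
    """Return (batch, height, width, depth) from the start of a VGSL spec.

    Hypothetical helper, not a Tesseract API. A value of 0 means that
    dimension is variable.
    """
    match = re.match(r"\s*\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)", spec)
    if match is None:
        raise ValueError("spec does not start with [batch,height,width,depth")
    batch, height, width, depth = (int(g) for g in match.groups())
    return batch, height, width, depth


# Spec string quoted in this thread for the Arabic tessdata_fast model.
fast_spec = "[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx96Lrx96Lfx128O1c1]"
print(vgsl_input_shape(fast_spec))
```

For the Arabic spec quoted in this thread the second number is 48, which is the height that line images are scaled to before training.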
I'm using the
|
The network spec for tessdata_best is not the same as that for tessdata_fast. I don't think we have the info for all tessdata_best languages. |
Also see #590 (comment) by Ray
|
Oh, I see. That makes sense now. |
Are your original images 2 column, or facing pages of book? If so it will be helpful to split them before generating line images. |
They're tables that occupy the entire space of the page. EDIT: I was able to finish the training without any error! Now it's just a matter of finding ways to improve the fine tuning. |
Hi there, I'm having the same error with 1 tif and 1000 iterations; however, lstmtraining keeps running.
I'm training on a single image, just to understand the mechanism and learn about it. I'm a bit confused by this, as the script still runs and the error rate keeps dropping.
I could not find details about how to train/fine-tune with my own tif/box files. It's unclear to me whether I need to generate the ground-truth data as well, whether I still need to fiddle with/fix the box files, etc. Sorry if I asked too many questions; I've invested so much time in this, and I'm not sure where exactly these questions fit: forum, new issue, Google Group? Later edit: |
I am getting: Even though all my images are 1900x17. |
All images are resized to a height of 36 or 48 pixels, depending on the network spec used. So it looks like your resized image may be too big. |
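To see why a 1900x17 image can trip the width limit, here is a rough Python sketch of the arithmetic. It assumes the line image is scaled to the network input height with its aspect ratio preserved, and that the limit is the `kMaxImageWidth = 2560` constant quoted earlier in this thread. The function names and the rounding are illustrative assumptions, not Tesseract's actual code.

```python
MAX_IMAGE_WIDTH = 2560  # kMaxImageWidth, as quoted earlier in this thread


def scaled_width(width, height, target_height=48):
    """Width after scaling to target_height, keeping the aspect ratio.

    Rounding to the nearest pixel is an assumption made for this sketch.
    """
    return round(width * target_height / height)


def fits_width_limit(width, height, target_height=48, max_width=MAX_IMAGE_WIDTH):
    """True if the scaled line image stays within the assumed limit."""
    return scaled_width(width, height, target_height) <= max_width


def max_source_width(height, target_height=48, max_width=MAX_IMAGE_WIDTH):
    """Widest source image of this height that still fits after scaling."""
    return (max_width * height) // target_height


for target in (36, 48):
    print(target, scaled_width(1900, 17, target), fits_width_limit(1900, 17, target))
print(max_source_width(17))
```

Under these assumptions a 1900x17 line grows to roughly 4000 pixels wide at height 36 and more than 5000 at height 48, so it exceeds the limit either way. Shorter lines or taller source images would be needed to stay under it.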
mkdir -p ~/tesstutorial/sanvedic
lstmtraining -U ~/tesstutorial/vedic/san.unicharset \
  --script_dir ../langdata --debug_interval 0 \
  --learning_rate 10e-5 \
  --net_spec '[1,0,0,1 Ct5,5,16 Mp3,3 Lfys64 Lfx128 Lrx128 Lfx384 O1c5000]' \
  --net_mode 192 \
  --perfect_sample_delay 19 \
  --model_output ~/tesstutorial/sanvedic/base \
  --train_listfile ~/tesstutorial/vedic/san.training_files.txt \
  --eval_listfile ~/tesstutorial/vedic/san.training_files.txt \
  --max_iterations 50000 \
  &> ~/tesstutorial/sanvedic/basetrain.log
Setting unichar properties
Setting properties for script Common
Setting properties for script Latin
Setting properties for script Devanagari
Unichar 2306=र्त्स्न्ये->र्त्स्न्ये is too long to encode!!
Warning: given outputs 5000 not equal to unicharset of 5018.
Num outputs,weights in serial:
1,0,0,1:1, 0
Num outputs,weights in serial:
C5,5:25, 0
Ft16:16, 416
Total weights = 416
[C5,5Ft16]:16, 416
Mp3,3:16, 0
Lfys64:64, 20736
Lfx128:128, 98816
Lrx128:128, 131584
Lfx384:384, 787968
Fc5018:5018, 1931930
Total weights = 2971450
Built network:[1,0,0,1[C5,5Ft16]Mp3,3Lfys64Lfx128Lrx128Lfx384Fc5018] from request [1,0,0,1 Ct5,5,16 Mp3,3 Lfys64 Lfx128 Lrx128 Lfx384 O1c5000]
Training parameters:
Debug interval = 0, weights = 0.1, learning rate = 0.0001, momentum=0.9
Loaded 828/828 pages (0-828) of document /home/shree/tesstutorial/vedic/san.AA_NAGARI_SHREE_L1.exp0.lstmf
Loaded 691/691 pages (0-691) of document /home/shree/tesstutorial/saneval/san.Aksharyogini2.exp0.lstmf
Loaded 1023/1023 pages (0-1023) of document /home/shree/tesstutorial/vedic/san.Sanskrit_2003.exp0.lstmf
Loaded 957/957 pages (0-957) of document /home/shree/tesstutorial/vedic/san.e-Nagari_OT.exp0.lstmf
Loaded 1060/1060 pages (0-1060) of document /home/shree/tesstutorial/vedic/san.FreeSans.exp0.lstmf
Loaded 691/691 pages (0-691) of document /home/shree/tesstutorial/saneval/san.Amiko.exp0.lstmf
Loaded 1213/1213 pages (0-1213) of document /home/shree/tesstutorial/vedic/san.Siddhanta-cakravat.exp0.lstmf
Loaded 1191/1191 pages (0-1191) of document /home/shree/tesstutorial/vedic/san.Sahadeva.exp0.lstmf
Loaded 1291/1291 pages (0-1291) of document /home/shree/tesstutorial/vedic/san.Santipur_OT_Medium.exp0.lstmf
Loaded 1115/1115 pages (0-1115) of document /home/shree/tesstutorial/vedic/san.Lohit_Devanagari.exp0.lstmf
Loaded 1210/1210 pages (0-1210) of document /home/shree/tesstutorial/vedic/san.Nakula.exp0.lstmf
Found AVX
Found SSE
Loaded 1188/1188 pages (0-1188) of document /home/shree/tesstutorial/vedic/san.Siddhanta-Calcutta.exp0.lstmf
Loaded 1211/1211 pages (0-1211) of document /home/shree/tesstutorial/vedic/san.Siddhanta.exp0.lstmf
Loaded 1214/1214 pages (0-1214) of document /home/shree/tesstutorial/vedic/san.Siddhanta-Nepali.exp0.lstmf
Loaded 1157/1157 pages (0-1157) of document /home/shree/tesstutorial/vedic/san.Uttara.exp0.lstmf
Image too large to learn!! Size = 2594x48
Image not trainable
Image too large to learn!! Size = 2758x48
Image not trainable
Image too large to learn!! Size = 2621x48
Image not trainable
At iteration 100/100/103, Mean rms=0.95%, delta=57.759%, char train=100.161%, word train=100%, skip ratio=3%, New worst char error = 100.161 wrote checkpoint
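To find out how many lines are being skipped and how wide they are, the training log can be scanned for the message shown above. This is a small Python sketch that relies only on the message format visible in this log; the helper name `oversized_sizes` is made up for illustration.

```python
import re

# Matches the message format seen in the training log above.
OVERSIZE_RE = re.compile(r"Image too large to learn!! Size = (\d+)x(\d+)")


def oversized_sizes(log_text):
    """Return a list of (width, height) for every oversized-image message."""
    return [(int(w), int(h)) for w, h in OVERSIZE_RE.findall(log_text)]


sample_log = """\
Image too large to learn!! Size = 2594x48
Image not trainable
Image too large to learn!! Size = 2758x48
Image not trainable
Image too large to learn!! Size = 2621x48
Image not trainable
"""

sizes = oversized_sizes(sample_log)
print(len(sizes), max(w for w, _ in sizes))
```

On a real run, the same function could be applied to the contents of the redirected log file (basetrain.log in the command above) to judge how much the text lines need to be shortened.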