Adding more languages for recognition models by default #563

felixdittrich92 · 2021-10-30T08:42:47Z

🚀 The feature

Currently only French is supported by default. It would be great to add more languages that can be selected directly, for example:
model = ocr_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True, language='en')
or
reco_arch = crnn_mobilenet_v3_large(pretrained=True, language='de')
What do you think?
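To make the proposal concrete, here is a minimal sketch of how a `language` argument could resolve to a checkpoint. Everything in it is hypothetical: the `CHECKPOINTS` table, the checkpoint names, and `resolve_checkpoint` are illustrative and not part of docTR.

```python
# Hypothetical sketch: resolve (architecture, language) to a checkpoint name.
# None of these names exist in docTR; they only illustrate the proposal.
CHECKPOINTS = {
    ("crnn_vgg16_bn", "fr"): "crnn_vgg16_bn-fr",
    ("crnn_vgg16_bn", "en"): "crnn_vgg16_bn-en",
    ("crnn_mobilenet_v3_large", "de"): "crnn_mobilenet_v3_large-de",
}


def resolve_checkpoint(reco_arch: str, language: str = "fr") -> str:
    """Return the checkpoint name for an architecture/language pair."""
    try:
        return CHECKPOINTS[(reco_arch, language)]
    except KeyError:
        # List what is available for this architecture to help the caller.
        available = sorted(lang for arch, lang in CHECKPOINTS if arch == reco_arch)
        raise ValueError(
            f"No pretrained weights for {reco_arch!r} in {language!r}; "
            f"available languages: {available}"
        ) from None
```

Defaulting to `"fr"` would keep the current behaviour unchanged for existing users.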

Motivation, pitch

In most cases you need recognition for a specific language. This can be done by training a model yourself, but it would be much easier if some frequently used languages were available without any training of your own.

Alternatives

Some other ideas:

auto detect language and choose the right model if provided
multilingual model
or allow choosing multiple models, e.g. languages=['en', 'fr', 'de']
Adding ViTSTR #513 (I think I will finish this by the end of the year on my side, which will then provide de, en, es, fr in one model, but there are currently no benchmarks)

Additional context

If you want, I can train all currently existing models in PyTorch for English and German.

fg-mindee · 2021-10-30T11:08:39Z

Hi @felixdittrich92,

Just to be clear, we support the French "vocab", not the language. For now, the library does not include any semantic understanding. For this reason, we took the French vocab, as the accented characters it includes usually cover most European characters.

Typically, there is no character in the English vocab that is not included in our "French" vocab.
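As a rough illustration of that subset relation, the check below uses small hand-written vocab strings. They are assumptions made for the example and not the exact vocabs shipped with the library.

```python
import string

# Illustrative vocabs only (assumed for this example, not the library's definitions).
ENGLISH_VOCAB = string.ascii_letters + string.digits + string.punctuation
FRENCH_VOCAB = ENGLISH_VOCAB + "àâéèêëîïôùûüçÀÂÉÈÊËÎÏÔÙÛÜÇ"


def missing_chars(text_vocab: str, model_vocab: str) -> set:
    """Characters of `text_vocab` that a model trained on `model_vocab` has never seen."""
    return set(text_vocab) - set(model_vocab)


print(missing_chars(ENGLISH_VOCAB, FRENCH_VOCAB))  # empty set: English is fully covered
print(sorted(missing_chars("Straße", FRENCH_VOCAB)))  # ['ß'] is outside this toy vocab
```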

Now, to switch to other vocabs, we will first have to wait until we have stabilized on this vocab. But the text recognition part will select the appropriate vocab & checkpoint depending on what the user asks for. Here are a few options:

using a text recognition model with a vocab that includes all characters of all languages. It makes it universal, but it's certainly harder to train, may perform worse on common vocabs, and it will be slower/heavier.
carefully designing the available vocabs, proposing checkpoints for popular vocabs and letting users submit theirs for alternative vocabs

The second option looks much better to me, I would argue 😅 If the text recognition model encounters characters it has never seen, it will yield very low confidence, which can easily be detected and handled accordingly.
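A minimal sketch of that handling step, assuming the recognition output is available as (text, confidence) pairs. The pair format, the `flag_low_confidence` helper, and the 0.5 threshold are assumptions for illustration, and the predictions are made up.

```python
from typing import List, Tuple


def flag_low_confidence(
    words: List[Tuple[str, float]], threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split recognized words into accepted and flagged lists by confidence."""
    accepted = [text for text, conf in words if conf >= threshold]
    flagged = [text for text, conf in words if conf < threshold]
    return accepted, flagged


# Made-up predictions: the last word stands in for text with unseen characters.
predictions = [("Hello", 0.98), ("world", 0.95), ("Stra?e", 0.21)]
accepted, flagged = flag_low_confidence(predictions)
```

Flagged words could then be routed to a model trained on another vocab or to manual review.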

In any case, this won't be handled before the 0.6.0 release or later as rotation & handwritten text are more critical to handle for now! (in the meantime, we will draft a process for contributors to submit their trained model on a given vocab though)!

I hope that answers your question :)

felixdittrich92 · 2021-11-02T18:06:47Z

@fg-mindee
closed with #576

fg-mindee · 2021-11-02T19:44:42Z

I may be mixing things up, but I don't feel like #576 is related to this issue? 🤔

felixdittrich92 · 2021-11-03T07:03:06Z

@fg-mindee
Hm, yes, after rethinking it, it is better to keep both issues. The idea was more like this:
add pretrained weights for all existing models before taking care of different vocabs. But you are right, those are two different problems.

felixdittrich92 · 2022-04-28T21:15:23Z

@frgfm I think the Hugging Face integration (sharing models) is enough, and it may be better to improve the word generator instead of pursuing this issue/idea. Wdyt?

frgfm · 2022-05-07T13:04:08Z

Fair point, but I would argue these are different topics:

the word generator is currently generating random character strings
it could be used to generate very specific words
being able to generate the data is one thing, but having someone train a model in dozens of languages is tricky
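For context, a random-string generator of the kind described in the first point can be sketched in a few lines. This is not docTR's word generator: `random_words`, its parameters, and the toy vocab are assumptions for illustration.

```python
import random
import string
from typing import List, Optional


def random_words(
    vocab: str,
    num_words: int,
    min_len: int = 1,
    max_len: int = 10,
    seed: Optional[int] = None,
) -> List[str]:
    """Generate words by sampling characters uniformly from `vocab`."""
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    return [
        "".join(rng.choice(vocab) for _ in range(rng.randint(min_len, max_len)))
        for _ in range(num_words)
    ]


toy_vocab = string.ascii_lowercase + "äöüß"  # illustrative German-flavoured vocab
words = random_words(toy_vocab, num_words=5, seed=0)
```

Strings like these cover the characters of a vocab but carry no linguistic structure, which is why generating real words is a separate problem.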

To close this issue, I think we should decide whether it's a feature design issue (how multiple vocab models should be accessed as pretrained models) or a wider question. Now that we can use HF hub models, I think we'll only have to change the model name to switch to another language. So if that's not about the design part, I'd argue this has been addressed :)

felixdittrich92 · 2022-05-09T06:55:14Z

@frgfm
I think the point would be to provide some data (with the word generator, made more robust and useful) so that users are able to train on different vocabs and share their models via the HF hub. We could do something like a pinned community call: grab a model, provide some information on what to do and how, and in the end add the models to the list that was added with #896, once we are stable on some other decisions like sar/master onnx 😅.
But generally it is enough if you can grab a trained model from the list because, as you said, it is not possible to cover every use case, which is why I would rather walk back the language='xyz' proposal.
In my opinion we can close this and maybe open another issue to iterate on the word generator.
Wdyt? :)

frgfm · 2022-05-16T22:23:30Z

I agree :)

But we could easily provide some HF Hub contribution guidelines for language (i.e. how you should name your model so that people can use it). "mindee/crnn_vgg16_bn_french" could easily be on the hub for instance
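One possible shape for such a convention, sketched under the assumption that repo names follow `<org>/<arch>_<vocab>` as in the example above. The `KNOWN_VOCABS` set and both helpers are hypothetical, and the convention actually adopted may differ.

```python
from typing import Tuple

# Hypothetical set of vocab suffixes; not an official list.
KNOWN_VOCABS = {"french", "english", "german"}


def build_repo_id(org: str, arch: str, vocab: str) -> str:
    """Compose a hub repo id such as 'mindee/crnn_vgg16_bn_french'."""
    if vocab not in KNOWN_VOCABS:
        raise ValueError(f"Unknown vocab {vocab!r}")
    return f"{org}/{arch}_{vocab}"


def parse_repo_id(repo_id: str) -> Tuple[str, str, str]:
    """Split a repo id back into (org, arch, vocab)."""
    org, name = repo_id.split("/", 1)
    # The vocab is the last underscore-separated token; the rest is the architecture.
    arch, _, vocab = name.rpartition("_")
    if not arch or vocab not in KNOWN_VOCABS:
        raise ValueError(f"{repo_id!r} does not follow <org>/<arch>_<vocab>")
    return org, arch, vocab
```

With a fixed suffix like this, switching language would come down to changing the last token of the model name.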

felixdittrich92 added the type: enhancement Improvement label Oct 30, 2021

fg-mindee self-assigned this Oct 30, 2021

fg-mindee added the topic: text recognition Related to the task of text recognition label Oct 30, 2021

felixdittrich92 closed this as completed Nov 2, 2021

felixdittrich92 reopened this Nov 3, 2021

fg-mindee added this to the 0.6.0 milestone Dec 10, 2021

fg-mindee added the module: models Related to doctr.models label Dec 10, 2021

felixdittrich92 mentioned this issue May 18, 2022

[docs] Add naming conventions for upload models to hf hub #921

Merged

felixdittrich92 closed this as completed May 20, 2022

frgfm mentioned this issue Jun 28, 2022

Release tracker - v0.6.0 #791

Closed

85 tasks
