Upcoming support for new model architectures #1007

Closed
2 of 3 tasks
Tracked by #1074
frgfm opened this issue Aug 1, 2022 · 3 comments · Fixed by #1443
Labels: framework: pytorch, framework: tensorflow, help wanted, module: models, topic: text detection, topic: text recognition, type: new feature

Comments

@frgfm
Collaborator

frgfm commented Aug 1, 2022

As discussed in several GH issues, docTR could very well welcome new architectures for OCR 👍
Let's use this issue to track this for the next release!

A few things to consider:

  • docTR is not meant to make every architecture available. Let's focus on architectures that are reasonably sized and reach SOTA performance (or are considered a performance milestone for a given task).
  • it is acceptable to start with an implementation for only one DL backend, but full support needs to be added gradually over the next releases.
  • for faster iterations, training should be performed on synthetic data when available (performance will be pushed on private datasets later, once the potential of the architecture is validated). A PR that adds an implementation for a given architecture should include the exact arguments used for training, so that the run can be reproduced, along with the corresponding performance.
  • we always have to credit the rightful contributors: papers are always cited in docTR, and an implementation is not meant to be a copy-paste of another implementation. However, if part of someone else's code is used, that author should be credited ("borrowed from", "inspired by", etc.).

Here is the list of envisioned models:

Text detection

Text recognition

@frgfm frgfm added the help wanted, module: models, framework: pytorch, framework: tensorflow, topic: text detection, topic: text recognition, and type: new feature labels Aug 1, 2022
@frgfm frgfm added this to the 0.6.0 milestone Aug 1, 2022
@frgfm frgfm mentioned this issue Aug 1, 2022
85 tasks
@felixdittrich92 felixdittrich92 modified the milestones: 0.6.0, 0.7.0 Sep 26, 2022
@SkaarFacee
Contributor

I was hoping to be able to help with implementing the PAN model

@felixdittrich92
Contributor

If you want, you can also work on this, but as mentioned we would need to keep it as a draft until the next release is done :)

@felixdittrich92
Contributor

felixdittrich92 commented Jan 15, 2024

Related: #1425 (TextNet backbone TF / PT) / #1443 FAST

@felixdittrich92 felixdittrich92 linked a pull request Feb 2, 2024 that will close this issue
3 tasks
@felixdittrich92 felixdittrich92 self-assigned this Feb 9, 2024