Adding ViTSTR #513
Comments
Hi @felixdittrich92, thanks for your message, it would be a pleasure to have you contributing to the lib! We already have a recognition model that includes a transformer decoder (MASTER), but we do not yet have full transformer architectures such as ViT or TrOCR. It is on the mid-term roadmap, and if you would like to propose your implementation, you are more than welcome to open a PR! 🙏 Please read the CONTRIBUTING section and feel free to look at the models already implemented in doctr 😃 Thank you and have a nice day 👍
I will do, thanks :) 👍
Hi @felixdittrich92, do you still plan to implement this? If not, we may close this issue to avoid a huge stack of unaddressed ones!
Huhu @charlesmindee 👋,
ok
@felixdittrich92 Hi, are there any model weights available for ViTSTR that are compatible with doctr? :) I saw these, but they seem to be named differently: https://github.com/roatienza/deep-text-recognition-benchmark/releases
Adding Vision Transformer for scene text recognition (ViTSTR): I am currently working on this (with a Hugging Face ViT backbone). Once I am done and have solid results, it would be a pleasure for me to add this model, if you are interested. :)
The same goes for the new unilm/TrOCR model.