Tensorflow implementation of OCR (Extracting words from images of the text) using CNNs and BiLSTMs.

shreshtashetty/OCR

Optical Character Recognition (OCR)

OCR here means recognizing the characters of a word from an image of that word. The network starts with convolutional layers, which turn the image into feature maps, followed by bidirectional LSTMs, which read those feature maps as a sequence and predict the letters. Training uses a Connectionist Temporal Classification (CTC) loss, which has been implemented as a separate layer.
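
To make the role of CTC more concrete, here is a minimal sketch of greedy CTC decoding, the usual way per-frame network outputs are turned into a word: pick the best class for each frame, collapse consecutive repeats, then drop the blank symbol. This is an illustration only and is not taken from this repository's code. The function name `ctc_greedy_decode`, the three-letter `charset`, and the convention that the blank is the last class index are all assumptions.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, charset):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-frame score lists, each of length
                  len(charset) + 1.
    charset:      string of characters; class i maps to charset[i].
    Assumption: the blank symbol is the LAST class, index len(charset).
    """
    blank = len(charset)

    # 1. Best class for every frame (argmax over the scores).
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]

    # 2. Collapse runs of the same class into a single occurrence.
    collapsed = [cls for cls, _ in groupby(best_path)]

    # 3. Remove blanks and map the remaining classes to characters.
    return "".join(charset[cls] for cls in collapsed if cls != blank)


# Hypothetical scores for 6 frames over the classes "a", "c", "t", blank.
frames = [
    [0.1, 0.7, 0.1, 0.1],  # c
    [0.1, 0.6, 0.1, 0.2],  # c (repeat, collapsed in step 2)
    [0.1, 0.1, 0.1, 0.7],  # blank
    [0.8, 0.1, 0.0, 0.1],  # a
    [0.1, 0.1, 0.1, 0.7],  # blank
    [0.1, 0.1, 0.7, 0.1],  # t
]
print(ctc_greedy_decode(frames, "act"))
```

The blank is what lets the network emit a doubled letter: two frames of "t" in a row collapse into one "t", while "t", blank, "t" decodes to "tt".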

This repo implements the paper "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition" by B. Shi et al.: https://arxiv.org/pdf/1507.05717.pdf

Dataset

This project uses the IAM Dataset, which can be found here: https://fki.tic.heia-fr.ch/databases/iam-handwriting-database

The dataset used here is much smaller than the one used in the paper, so the model is not as deep as the one the paper describes.

If you find this repository useful, please cite the following:

 @misc{Shreshta2021OCR,
   author = {Shetty, Shreshta}, 
   title = {OCR}, 
   year = {2021}, 
   publisher = {GitHub}, 
   journal = {GitHub repository}, 
   howpublished = {\url{https://github.com/shreshtashetty/OCR}},
 }
