Deploying Handwritten Text Recognition Using TensorFlow and CNN

Posted By: Hemant Chauhan | 27th April 2020

 

TensorFlow is an open-source deep learning framework for machine learning. We use TensorFlow to build OCR systems for handwritten text recognition, object detection, and number plate recognition, where training deep learning models helps address the accuracy issues these tasks raise. As a well-positioned AI development company, Oodles AI explores how to build and deploy handwritten text recognition using TensorFlow and CNN from scratch.

 

Handwritten Text Recognition (HTR) systems enable computers to receive and interpret handwritten input from sources such as scanned images. These systems convert handwritten text into digital text, so that it can be digitized, stored, and mined for valuable information for accurate analysis. At Oodles, we use tools like OpenCV and provide TensorFlow development services to build a Neural Network (NN) that is trained on line-images from the off-line HTR dataset.

 

The text written in the scanned image is split into segmented line images, and the Neural Network (NN) model works on these line-images, which are smaller than the image of the complete page. On the validation set, about 9 out of 10 words of a segmented line are correctly recognized, and the character error rate is around 8%.
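
To make the two metrics concrete, here is a minimal sketch of how a character error rate and a word accuracy could be computed. It assumes the character error rate is the total edit distance divided by the total number of ground-truth characters, which is a common definition but is not spelled out in this article. The function names and the sample pairs are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a character of ref
                            curr[j - 1] + 1,      # insert a character of hyp
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def evaluate(pairs):
    """Return (character error rate, word accuracy) for (truth, recognized) pairs."""
    char_errors = sum(edit_distance(truth, rec) for truth, rec in pairs)
    char_total = sum(len(truth) for truth, _ in pairs)
    words_ok = sum(1 for truth, rec in pairs if truth == rec)
    return char_errors / char_total, words_ok / len(pairs)


# Hypothetical recognition results, made up for illustration only.
samples = [("little", "little"), ("house", "hause"), ("word", "word"), ("text", "text")]
cer, word_acc = evaluate(samples)
print(f"CER: {cer:.2%}, word accuracy: {word_acc:.2%}")
```

In this toy example one character out of 19 is wrong and three of four words match exactly, which shows why the word accuracy can be much lower than the character accuracy.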

 

 Image Source:  https://medium.com/apache-mxnet/handwriting-ocr-line-segmentation-with-gluon-7af419f3a3d8

 

The network is made up of 5 CNN layers and 2 RNN layers, and the workflow can be divided into 3 steps:

 

1. Create 5 Convolutional Neural Network (CNN) layers

There are 5 CNN layers, and each one applies three operations. First, a convolution is applied to the input, with 5×5 filter kernels in the first 2 layers. Second, the non-linear ReLU function is applied. Finally, a pooling layer downsizes the result. The output is a feature map.
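
The sketch below traces how the feature map shape could change across the 5 layers. Only the 5×5 kernels in the first two layers come from the text above. The 128×32 grayscale input, the "same" padding, the later kernel sizes, the channel counts and the pooling windows are all assumptions, chosen so that the final feature map has 32 positions with 256 features each.

```python
# Assumed layer settings (hypothetical apart from the two 5x5 kernels).
LAYERS = [
    # (kernel size, output channels, (pool_w, pool_h))
    (5, 32, (2, 2)),
    (5, 64, (2, 2)),
    (3, 128, (1, 2)),
    (3, 128, (1, 2)),
    (3, 256, (1, 2)),
]


def trace_shapes(width, height, layers):
    """Return the (width, height, channels) shape of the input and after each layer."""
    shapes = [(width, height, 1)]  # one channel: grayscale input (assumed)
    for _kernel, channels, (pool_w, pool_h) in layers:
        # With "same" padding the convolution keeps width and height,
        # ReLU is element-wise, so only the pooling step shrinks the map.
        width, height = width // pool_w, height // pool_h
        shapes.append((width, height, channels))
    return shapes


for i, shape in enumerate(trace_shapes(128, 32, LAYERS)):
    print(f"after layer {i}: {shape}" if i else f"input: {shape}")
```

Under these assumptions the last shape is (32, 1, 256), which can be read as a sequence of 32 time steps with 256 features per step.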

2. Create Recurrent Neural Network (RNN) layers and return their output

Create and stack two RNN layers with 256 units each, and build a bidirectional RNN from the stacked layers. This gives 2 output sequences, forward and backward, each of size 32×256. The output is used to calculate the loss value and is also decoded into the final text.
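
The decoding step can be illustrated with a small sketch. It assumes the network is trained with a CTC loss and that the last class index is the CTC blank, neither of which is stated explicitly in this article. The function name, the two-character alphabet and the scores are hypothetical. Best-path decoding takes the most likely label at each time step, merges repeated labels, and then removes blanks.

```python
from itertools import groupby


def best_path_decode(scores, charset):
    """Greedy CTC decoding of per-time-step score lists (last index = blank)."""
    blank = len(charset)
    # 1. Pick the highest-scoring label at every time step.
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    # 2. Merge runs of identical labels into one label.
    collapsed = [label for label, _ in groupby(best)]
    # 3. Drop blanks and map the remaining labels to characters.
    return "".join(charset[label] for label in collapsed if label != blank)


# Toy scores for the alphabet "ab" plus blank, 7 time steps (made up).
toy_scores = [
    [0.8, 0.1, 0.1],  # a
    [0.6, 0.2, 0.2],  # a (repeat, merged with the previous step)
    [0.1, 0.1, 0.8],  # blank (separates the two a's)
    [0.7, 0.2, 0.1],  # a
    [0.2, 0.7, 0.1],  # b
    [0.1, 0.8, 0.1],  # b (repeat)
    [0.1, 0.1, 0.8],  # blank
]
print(best_path_decode(toy_scores, "ab"))  # prints "aab"
```

The blank between the first two groups is what allows a doubled letter to survive the merge step.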


 

Architecture

 

 

 

Image Source: https://medium.com/@arthurflor23/handwritten-text-recognition-using-tensorflow-2-0-f4352b7afe16 

 

3. Create an IAM-compatible dataset and train the model

The data-loader expects the IAM dataset [5] in the data/ directory. Follow these steps to get the dataset:

  1. Register for free at http://www.fki.inf.unibe.ch/databases/iam-handwriting-database.
  2. Download words/words.tgz and extract it.
  3. Download ascii/words.txt.
  4. Put words.txt into the data/ directory.
  5. Create the directory data/words/.
  6. Put the content (directories a01, a02, ...) of words.tgz into data/words/.
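
Once the files are in place, each entry of words.txt has to be matched to its image. The sketch below shows one way this could be done. It assumes that comment lines in words.txt start with "#", that each data line has at least nine space-separated fields with the word id first and the transcription last, and that the images are PNG files nested as data/words/a01/a01-000u/. These are assumptions about the dataset layout, and parse_words_line is a hypothetical helper, not the project's data-loader.

```python
from pathlib import Path


def parse_words_line(line, words_dir=Path("data/words")):
    """Return (image_path, transcription) for a data line, or None otherwise."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and header comments
    fields = line.split(" ")
    if len(fields) < 9:
        return None  # not in the assumed nine-field format
    word_id = fields[0]  # for example a01-000u-00-00
    parts = word_id.split("-")
    # Assumed nesting: <form prefix>/<form id>/<word id>.png
    image_path = words_dir / parts[0] / f"{parts[0]}-{parts[1]}" / f"{word_id}.png"
    transcription = " ".join(fields[8:])
    return image_path, transcription


# Example line in the assumed format.
example = "a01-000u-00-00 ok 154 408 768 27 51 AT A"
print(parse_words_line(example))
```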

 

Train the model from scratch

 

To train the model from scratch, go to the src/ directory of the project and run the following command in a terminal: python main.py --train. After training, the model is validated on a validation set (the dataset is split so that 95% of the samples are used for training and 5% for validation, as defined in the class DataLoader). To run validation on its own, execute the command python main.py --validate. Training on the CPU takes about 30 hours on a normal configuration system.
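
The 95% / 5% split can be sketched as follows. The function name split_samples, the shuffle before splitting and the fixed seed are assumptions for illustration. The text above only gives the proportions and says that the split is defined in the DataLoader class.

```python
import random


def split_samples(samples, train_fraction=0.95, seed=0):
    """Split samples into (training, validation) lists by the given fraction."""
    shuffled = list(samples)
    # Shuffling with a fixed seed (an assumption) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    split_idx = round(train_fraction * len(shuffled))
    return shuffled[:split_idx], shuffled[split_idx:]


# Stand-in for a list of (image_path, transcription) samples.
train, validation = split_samples(range(1000))
print(len(train), len(validation))  # prints "950 50"
```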

 

Further improvements to accuracy

Here are some steps to improve accuracy:

  • Add more CNN layers
  • Text correction (autocorrect spell checker)
  • Increase the input size
  • Better decoding approach: use word beam search decoding
  • Data augmentation: increase the dataset size by applying further (random) transformations to the input images
  • Remove cursive writing style from the input images
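
As an illustration of the text correction step, the sketch below replaces a recognized word with the closest word from a known vocabulary. It uses difflib from the Python standard library as a stand-in, not the autocorrect spell checker listed in the requirements. The vocabulary, the cutoff value and the function name correct_word are hypothetical.

```python
import difflib


def correct_word(word, vocabulary, cutoff=0.7):
    """Return the closest vocabulary word, or the input if nothing is close enough."""
    if word in vocabulary:
        return word
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


vocabulary = ["house", "little", "handwriting", "recognition"]
print(correct_word("hause", vocabulary))  # close to "house", so it is corrected
print(correct_word("xyz", vocabulary))    # nothing similar, returned unchanged
```

A stricter cutoff lowers the risk of replacing a correctly recognized rare word with a more common one.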

 

Requirements

 

  1. TensorFlow 1.8.0
  2. Flask
  3. Numpy
  4. OpenCV 3
  5. Spell Checker Autocorrect

 

Build Handwritten Text Recognition models using TensorFlow With Oodles AI

 

We, at Oodles, have hands-on experience in building and deploying printed and handwritten text recognition using TensorFlow, CNN, OpenCV, and Tesseract frameworks. Our team recently built an AI-powered OCR system to extract critical identity information from Aadhaar cards, PAN cards, and other identity documents. The model enables digital businesses to streamline and automate onboarding and identity verification processes with ease and accuracy.

 

Reach out to our AI development team to learn more about our AI OCR capabilities and projects. 

 


About Author

Hemant Chauhan

Hemant is an accomplished backend developer with extensive experience in software development. He possesses an in-depth understanding of various technologies and has a strong command over Java, Spring Boot, MySQL, Elasticsearch, Selenium with Java, GitHub/GitLab, HTML/CSS, and MongoDB. Hemant has worked on several related projects, including Tesseract OCR, Sikuli with Selenium Automation, Transleqo, and currently, SecureNow. He excels at managing trading bots, developing centralized exchanges, and has a creative mindset with exceptional analytical skills.
