Funded by:


Deliverables

Release No.

Due Date

Deliverables

Status


1.
 

31 July, 2012

  • Training Text Image corpus at 14 point size
  • Training Text Image corpus at 16 point size
  • Design Document of Core Urdu OCR Framework using Tesseract
  • Text Corpus Collection, Cleaning and Tagging tools

 Submitted


2.
 

30 November, 2012

  • Training Text Image corpus at 20-30 font sizes
  • Training Text Image corpus at 30-44 font sizes
  • Research Report of Prototype ligature-based OCR for 14 point size
  • Prototype system of ligature-based OCR for 14 point size

 Submitted


3.
 

31 March, 2013

  • Research Report on Corpus Collection, Design and Release
  • Release of Overall Image Corpus
  • Research Report on Binarization System
  • Binarization System
  • Research Report of Text Area and Figure Identification
  • Text Area and Figure Identification System
  • Research Report of Urdu Word Segmentation System
  • Urdu Word Segmentation System

 Submitted


4.
 

31 July, 2013

  • Research Report of Page Frame Detection System
  • Page Frame Detection System
  • Research Report on Segmentation-based OCR for 14 Point Size
  • Segmentation-based OCR for 14 point size
  • Research Report of Ligature to Word Mapping System
  • Ligature to Word Mapping System
  • Research Report of 14 point size ligature-based OCR
  • Ligature-based OCR for 14 point size

 Submitted


5.
 

30 November, 2013

  • Cleaned and Tagged Urdu Text Corpus
  • Research Report of Noise Removal System
  • Noise Removal System
  • Research Report on segmentation-based OCR for 16 point size
  • Segmentation-based OCR for 16 point size
  • Research Report of ligature-based OCR for 16 point size
  • Ligature-based OCR System for 16 point size
  • Research Report of Font size independent OCR for 16-24 point sizes
  • Font size independent OCR system for 16-24 point sizes

 Submitted


6.
 

30 April, 2014

  • Research Report on Skew Detection System
  • Skew Detection System
  • Research Report on Nastalique and Latin Script Detection
  • Nastalique and Latin Script Detection System
  • Research Report on Run Marking System
  • Run Marking System
  • Research Report on Text Image Segmentation
  • Text Image Segmentation System
  • Research Report on segmentation-based OCR for 24 point size
  • Segmentation-based OCR for 24 point size
  • Research Report on segmentation-based OCR for 36 point size
  • Segmentation-based OCR for 36 point size
  • Research Report on font size independent OCR for 24-36 font sizes
  • Font size independent OCR System for 24-36 font sizes
  • Research Report on font size independent OCR for 36-44 font sizes
  • Font size independent OCR for 36-44 font sizes

 Submitted


7.
 

31 August, 2014

  • Complete Urdu OCR System

 Submitted

 

Executed by:







Copyrights: Center for Language Engineering, 2014
webmaster@cle.org.pk