Center for Language Engineering

 
 



 

 


 
   
  Urdu Nastalique Optical Character Recognition System  
   
 
The Urdu Nastalique Optical Character Recognition project, funded by the National ICT R&D Fund, aims to develop a system that converts images of Urdu text (written in the Noori Nastalique style) into computer-editable text. The project focuses on recognizing images of Urdu textbooks with font sizes between 14 and 44.
http://www.cle.org.pk/ocr
 
 
 
 
Project Details
Start date of project: 1st March, 2012
Duration of project: 30 months
Funding agency: National Information and Communication Technology Research and Development (ICT R&D) Fund, Pakistan
Principal investigator: Dr. Sarmad Hussain
Project status: In progress
Objectives
  1. To develop Urdu OCR for Nastalique style of writing.
  2. To develop post-processing algorithms in computational linguistics for output generation and error correction of Urdu OCR.
  3. To identify future research directions for graduate research in this area.
  4. To provide access to textual information to print-disabled communities.
Scope of work
  1. The following character set will be recognized by the Urdu OCR:
    1. Urdu alphabet set including Urdu digits, Urdu aerab and Latin digits.
    2. Other symbols of Urdu, as follows:
      ؁ ؀ ؂ ؃ ؎ ؏ ؐ   ؑ    ؒ   ؓ   ؔ ؟ () ' " ۔ ؛ : ،
  2. Text written in the Noori Nastalique font style with font sizes between 14 and 44 will be recognized.
  3. The system will handle up to 2 columns of text.
  4. The Urdu OCR will handle Latin script written in the Times New Roman, Arial and Courier fonts, within the font size range proposed for Urdu.
  5. The system will output plain Urdu text in Unicode format.
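The scope above can be written down as a small lookup, which may help when checking whether a test document falls inside it. The following is a minimal Python sketch, not part of the project's software. All names in it are hypothetical. It assumes that "Urdu digits" means the Extended Arabic-Indic digits (U+06F0 to U+06F9), and it leaves the Urdu alphabet and aerab unclassified because the scope does not enumerate them.

```python
# Illustrative sketch only; every name here is hypothetical.

# Symbols listed in item 1.2 of the scope, given as Unicode code points.
SCOPE_SYMBOLS = frozenset(
    [chr(cp) for cp in (
        0x0600, 0x0601, 0x0602, 0x0603,          # number, year, footnote, page signs
        0x060E, 0x060F,                          # poetic verse sign, sign misra
        0x0610, 0x0611, 0x0612, 0x0613, 0x0614,  # honorific signs and takhallus
        0x061F, 0x06D4, 0x061B, 0x060C,          # question mark, full stop, semicolon, comma
    )]
    + list("()'\":")
)

# Assumption: "Urdu digits" are the Extended Arabic-Indic digits.
URDU_DIGITS = frozenset(chr(cp) for cp in range(0x06F0, 0x06FA))
LATIN_DIGITS = frozenset("0123456789")

# Font size range stated in item 2 of the scope.
MIN_FONT_SIZE, MAX_FONT_SIZE = 14, 44


def font_size_in_scope(size):
    """Return True if the font size lies within the stated 14 to 44 range."""
    return MIN_FONT_SIZE <= size <= MAX_FONT_SIZE


def classify(ch):
    """Classify one character against the parts of the scope that are enumerated."""
    if ch in SCOPE_SYMBOLS:
        return "symbol"
    if ch in URDU_DIGITS:
        return "urdu_digit"
    if ch in LATIN_DIGITS:
        return "latin_digit"
    # Urdu letters, aerab and Latin letters are not enumerated in the scope.
    return "unclassified"
```

For example, `classify("\u06d4")` reports the Urdu full stop as a symbol, and `font_size_in_scope(12)` is false because 12 is below the stated minimum.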
(Anticipated) Deliverables
  • OCR for 14 point size
  • OCR for 16 point size
  • OCR for 24 point size
  • OCR for 36 point size
  • Complete Urdu OCR System for 14 to 44 point sizes
Useful links: http://www.cle.org.pk/ocr
 
     
     
 
