Center for Language Engineering




Al-Khawarizmi Institute of Computer Science

News Details
An Urdu text corpus distribution agreement has been signed between Urdu Digest and the Center for Language Engineering (CLE), Al-Khawarizmi Institute of Computer Science (KICS), University of Engineering and Technology (UET), Lahore, at a meeting in Lahore.

Urdu Digest is a leading general-interest Urdu magazine with a 52-year publication history. Its articles and stories cover a range of subjects, including education, health, politics, international affairs, sports, business, humor and literature.

Under this agreement, Urdu Digest has generously contributed its text data, comprising millions of words from previous publications, to develop the first-ever corpus of Urdu in Pakistan. The corpus will be developed by CLE, KICS-UET and distributed to linguists and researchers doing computational work in speech and language processing. The contribution is also a significant milestone in the promotion and development of the Urdu language.