First Workshop on Internationalized Domain Names for Pakistani Languages

 
 


Date and Venue: Sunday, April 20, 2008, Center for Research in Urdu Language Processing, FAST-NUCES, Lahore.

 


               

Introduction

The ability to communicate in one's native language is a fundamental freedom. It is closely tied to economic growth and human development, because language is one of the key ways in which cultural diversity is realized. Over the years, the Internet has become a major medium for sharing information. As its use grows, it is important that information on the Internet be available in local languages. This will help develop digital literacy through the use of local languages, encourage research and development (R&D) in local language computing, and promote the adoption of local language technology by communities.

A great deal of work has been done in this regard, and it is now possible to publish information on the Internet in local languages. Although content can be deployed in local languages, accessing it still requires knowledge of the Latin script and English conventions, because the Domain Name System (DNS) uses the Latin script along with English conventions and abbreviations. A domain name is the address used to reach a resource on the Internet, e.g. www.crulp.org. A domain name that can encode languages written in scripts other than Latin is called an Internationalized Domain Name (IDN), e.g. ووو۔مرکز تحقیقات اردو۔ادارہ. Internationalizing Domain Names in Applications (IDNA) is a proposed solution for internationalized domain names. At an abstract level, it can be described as a layer that takes a domain name in a local language as input, normalizes that input, and converts it into a DNS-compatible form.
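The layer described above (take a local-language name, normalize it, convert it to a DNS-compatible form) can be sketched with Python's built-in "idna" codec. This is only an illustration under stated assumptions: the codec implements the IDNA 2003 rules, the helper names to_dns_form and from_dns_form are hypothetical, and the sample name is the Urdu word "اردو" under a made-up ".example" suffix rather than the address quoted above.

```python
# A minimal sketch of the IDNA layer using Python's standard "idna" codec
# (IDNA 2003: normalization via Nameprep, then Punycode with an "xn--" prefix).
# The helper names and the sample domain are illustrative assumptions.

def to_dns_form(domain: str) -> str:
    """Normalize each label and convert it to its ASCII (DNS-compatible) form."""
    return domain.encode("idna").decode("ascii")


def from_dns_form(ascii_domain: str) -> str:
    """Convert a DNS-compatible name back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # U+0627 U+0631 U+062F U+0648 spell the word "Urdu" in Arabic script.
    local_name = "\u0627\u0631\u062f\u0648.example"
    dns_name = to_dns_form(local_name)
    print(dns_name)                  # ASCII-only form, first label starts with "xn--"
    print(from_dns_form(dns_name))   # round-trips back to the Unicode name
```

The DNS itself only ever sees the ASCII form; the conversion in both directions happens in the application, which is why the approach needs no change to DNS servers.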

A single script may be used for multiple languages. Pakistan, for example, has many languages, some of which are also spoken across its borders. Most languages spoken in Pakistan use the Arabic script, but the characters used vary from language to language.

Purpose

The purpose of this meeting is to discuss character set choices for the local languages of Pakistan from the Unicode Arabic blocks (U+0600-U+06FF, U+0750-U+077F) with reference to internationalized domain names: (i) identify the characters required for each language represented, (ii) identify confusable characters within a language, (iii) identify confusable characters across languages, and (iv) identify any characters in these languages that are not represented in Unicode.
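As a rough illustration of the first step, the sketch below checks whether the characters of a label fall inside the two Unicode ranges named above. The function names are hypothetical, and the check says nothing about which characters a particular language actually needs; that selection is the subject of the workshop.

```python
# A minimal sketch: test characters against the Unicode ranges cited above.
# Function names are illustrative assumptions, not part of any standard.

ARABIC_RANGES = ((0x0600, 0x06FF), (0x0750, 0x077F))


def in_arabic_blocks(ch):
    """Return True if the character lies in one of the two cited ranges."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in ARABIC_RANGES)


def characters_outside_blocks(label):
    """List the code points in a label that fall outside the cited ranges."""
    return ["U+%04X" % ord(ch) for ch in label if not in_arabic_blocks(ch)]


if __name__ == "__main__":
    # U+0643 (ARABIC LETTER KAF) and U+06A9 (ARABIC LETTER KEHEH) are both
    # in range; they look alike, which is the kind of pair that items (ii)
    # and (iii) are concerned with.
    print(in_arabic_blocks("\u0643"), in_arabic_blocks("\u06a9"))
    print(characters_outside_blocks("\u0627b"))
```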

Acknowledgement

The workshop is funded and supported by the PAN Localization Project, the National University of Computer and Emerging Sciences, and the Center for Research in Urdu Language Processing.

 
