Introduction
The ability to communicate in one's native language is a fundamental freedom and is closely tied to economic growth and human development, because language is one of the key means of realizing cultural diversity. Over the years, the Internet has become a major medium for sharing information. As Internet use grows, it is important that information on the Internet be available in local languages. This will help to develop digital literacy through the use of local languages, encourage research and development (R&D) in local language computing, and promote the adoption of local language technology by communities.
A lot of work has been done in this regard, and it is now possible to publish information on the Internet in local languages. Although content in local languages can now be deployed, accessing it still requires knowledge of the Latin script and English conventions, because the Domain Name System (DNS) is in Latin script and uses English conventions and abbreviations. A domain name is the address used to access a resource on the Internet, e.g. www.crulp.org. A domain name that can encode languages written in scripts other than Latin is called an Internationalized Domain Name (IDN), e.g. ووو۔مرکز تحقیقات اردو۔ادارہ. Internationalizing Domain Names in Applications (IDNA) is a proposed solution for internationalized domain names. At an abstract level, it may be defined as a layer that takes a domain name in a local language as input, normalizes that input, and converts it into a DNS-compatible form.
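The normalize-then-convert idea can be illustrated with a short Python sketch. This is a simplified illustration only, not the full IDNA procedure: it assumes NFKC as the normalization step and uses Python's built-in `punycode` codec for the conversion, whereas a real IDNA implementation applies additional mapping and validity rules. The function names `to_dns_label` and `from_dns_label` are hypothetical and chosen for this example.

```python
import unicodedata

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label in the DNS


def to_dns_label(label: str) -> str:
    """Simplified sketch: normalize one label, then make it ASCII-only.

    Assumption: NFKC stands in for the full IDNA preparation step.
    """
    normalized = unicodedata.normalize("NFKC", label)
    if normalized.isascii():
        # Plain Latin labels need no encoding; DNS treats them case-insensitively.
        return normalized.lower()
    return ACE_PREFIX + normalized.encode("punycode").decode("ascii")


def from_dns_label(label: str) -> str:
    """Reverse the conversion so the label can be displayed in the local script."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


# "\u0627\u0631\u062f\u0648" is the word "Urdu" written in Arabic script.
urdu = "\u0627\u0631\u062f\u0648"
encoded = to_dns_label(urdu)
print(encoded)                  # an ASCII-only label starting with "xn--"
print(from_dns_label(encoded))  # the original label in Arabic script
```

The sketch handles a single label; a full domain name would be split on its separators and converted label by label.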
A single script may be used for multiple languages. For example, Pakistan has multiple languages, some of which are spoken across its borders. Most of the languages spoken in Pakistan use the Arabic script, but the characters used vary from language to language.
Purpose
The purpose of this meeting is to discuss particular character set choices for the local languages of Pakistan from the Unicode Arabic blocks (U+0600-U+06FF, U+0750-U+077F) with reference to internationalized domain names: (i) identify the characters required for each language represented, (ii) identify confusable characters within a language, (iii) identify confusable characters across languages, and (iv) identify the character(s) in these languages which are not represented in Unicode.
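As a rough aid for the first task, a candidate character can be checked against the two Unicode ranges named above. The following Python sketch assumes only those two ranges; the names `ARABIC_BLOCKS`, `in_arabic_blocks`, and `describe` are hypothetical and introduced for illustration.

```python
import unicodedata

# The two code point ranges named in the text (inclusive).
ARABIC_BLOCKS = ((0x0600, 0x06FF), (0x0750, 0x077F))


def in_arabic_blocks(ch: str) -> bool:
    """Return True if the character falls inside either range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ARABIC_BLOCKS)


def describe(text: str):
    """List (code point, Unicode name, inside-range flag) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), in_arabic_blocks(ch))
        for ch in text
    ]


# Inspect a sample string one character at a time.
for code_point, name, inside in describe("\u0627\u0631\u062f\u0648"):
    print(code_point, name, inside)
```

Listing the code point and name beside each character also helps when comparing characters that look alike but are encoded differently.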
Acknowledgement
The workshop is funded and supported by the PAN Localization Project, the National University of Computer and Emerging Sciences, and the Center for Research in Urdu Language Processing.