|
The Center for Language Engineering (CLE) is pleased to release a testing corpus for English-to-Urdu machine translation systems. It is highly recommended that this corpus not be used to train machine translation systems, so that later evaluation on it remains unbiased. The corpus contains 400 English sentences collected from different newspapers and periodicals, including Pakistani, English and American publications, e.g. Nation, News, Pakistan Times, Dawn, BBC, CNN, NYT, Washington Post, Times, NewsWeek, National Geographic, Economist, etc. Each collected sentence was then translated into Urdu by three translators working independently.
The work has been supported by the International Development Research Center (IDRC) of Canada, through the PAN Localization project (www.PANL10n.net). |
|