|
Urdu text corpora commonly have issues with spaces, the zero-width non-joiner (ZWNJ), compound words, typographical errors, normalization, and affixation. This utility lets the user semi-automatically remove these errors and clean the text corpus. The user navigates a text file line by line using Next and Previous buttons. Separate buttons add or remove the zero-width non-joiner and Urdu special symbols with a single click.
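The operations described above can be illustrated with a minimal Python sketch. It is not the utility's actual code: the names `LineNavigator`, `remove_zwnj`, `insert_zwnj`, and `collapse_spaces` are hypothetical and chosen only for illustration. The one fact it relies on is that the zero-width non-joiner is the Unicode character U+200C.

```python
import re

ZWNJ = "\u200c"  # Unicode ZERO WIDTH NON-JOINER


def remove_zwnj(line: str) -> str:
    """Delete every zero-width non-joiner from the line."""
    return line.replace(ZWNJ, "")


def insert_zwnj(line: str, index: int) -> str:
    """Insert a zero-width non-joiner before the character at `index`."""
    return line[:index] + ZWNJ + line[index:]


def collapse_spaces(line: str) -> str:
    """Replace runs of spaces or tabs with one space and trim both ends."""
    return re.sub(r"[ \t]+", " ", line).strip()


class LineNavigator:
    """Hypothetical stand-in for the Next/Previous buttons (no GUI)."""

    def __init__(self, lines):
        self.lines = list(lines)
        if not self.lines:
            raise ValueError("at least one line is required")
        self.pos = 0

    def current(self) -> str:
        return self.lines[self.pos]

    def next(self) -> str:
        # Stay on the last line instead of running past the end.
        if self.pos < len(self.lines) - 1:
            self.pos += 1
        return self.current()

    def previous(self) -> str:
        # Stay on the first line instead of moving before the start.
        if self.pos > 0:
            self.pos -= 1
        return self.current()
```

In a real tool each function would be bound to a button and applied to the line currently shown, with the user deciding case by case whether the change is correct. That manual review step is what makes the cleanup semi-automatic.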
|
|