Center for Research in
Urdu Language Processing

 
 


 

 

 

National University of Computer & Emerging Sciences
FAST-NUCES


 
 


 
   
  Previous Projects  
 


 
     
  The projects previously carried out at CRULP are:  
       
  Urdu Localization Project

 

 
 

The Urdu Localization Project aims to bring the benefits of the information age to the vast majority of Pakistanis who are not literate in English, the lingua franca of the Internet, and who are therefore deprived of the immense possibilities offered by this revolution. It will also usher Urdu, the national language of Pakistan, spoken and understood by the masses, into the information age.

 
 

 

 
  Urdu Component Development  
 

Spell checking, collation and normalization are basic language utilities. The purpose of this project is to provide APIs for these utilities for the Urdu language. The SpellChecker utility checks words for spelling errors and suggests a ranked list of words when an error is found. The Collation utility provides a language-sensitive comparison of two strings for sorting. Normalization converts multiple equivalent representations of data into a consistent underlying normal form.
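The three utilities can be illustrated with a minimal Python sketch. This is not the CRULP API: the function names normalize, collation_key and suggest are hypothetical, the alphabet table covers only the first ten Urdu letters, Unicode NFC is assumed as the normal form, and similarity ranking from the standard difflib module stands in for whatever ranking the project actually uses.

```python
import difflib
import unicodedata

# Hypothetical, partial Urdu alphabet order (first ten letters only):
# alif, be, pe, te, tte, se, jim, che, he, khe.
URDU_ORDER = "\u0627\u0628\u067E\u062A\u0679\u062B\u062C\u0686\u062D\u062E"
_RANK = {ch: i for i, ch in enumerate(URDU_ORDER)}


def normalize(text):
    """Map equivalent encodings to one form (assumption: Unicode NFC)."""
    return unicodedata.normalize("NFC", text)


def collation_key(text):
    """Sort key that follows URDU_ORDER instead of code point order.

    Characters outside the partial table sort after it, by code point.
    """
    return [_RANK.get(ch, len(_RANK) + ord(ch)) for ch in normalize(text)]


def suggest(word, lexicon, n=3):
    """Return up to n suggestions, most similar first.

    An empty list means the word was found in the lexicon.
    """
    word = normalize(word)
    words = [normalize(w) for w in lexicon]
    if word in words:
        return []
    return difflib.get_close_matches(word, words, n=n)
```

For example, alif followed by a combining madda (U+0627 U+0653) and the precomposed alif with madda (U+0622) are equivalent representations, and normalize maps both to U+0622. Sorting be, pe and te by code point places pe last, because pe (U+067E) is encoded after the basic Arabic letters, whereas sorting with key=collation_key gives the alphabetical order be, pe, te.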

 
 


 
 

Lexicon for Urdu Language

 
 

This project aims to develop a lexicon of the Urdu language for Nokia, to be used for future development of speech and language technology. The lexicon covers commonly used Urdu words, some domain-specific words and proper nouns. It will also contain basic grammatical and pronunciation information for these words, and will provide almost complete corpus (and language) coverage. The lexicon will be the fundamental building block for other applications in script, speech and language technologies to be developed in the future, ranging from basic user services (e.g., SMS support, address book) to more advanced user assistance applications (e.g., text-to-speech, speech recognition, spoken language translation and handwriting recognition). Nokia has already indicated that follow-up work on Urdu speech synthesis will be undertaken using this lexicon, based on the unit selection technique, which CRULP has not yet worked with. This project is sponsored by Nokia Research, Beijing, China.

 
 


 
  Sindhi English Dictionary  
 

The Sindhi English Dictionary comprises more than 2,000 words taken from Mewaram Parmanand’s Sindhi-English Dictionary (1866-1938, non-copyrighted), which presents Sindhi words in the Sindhi-Arabic script. Each Sindhi entry consists of variants, synonyms, antonyms, part of speech, inflections, grammatical features and English senses. The digital dictionary addresses research and pedagogical needs related to the Sindhi language. It can be used as a basic reference by scholars and students who want to access Sindhi literature on the web. The project was supervised by Dr. Sarmad Hussain at the National University of Computer and Emerging Sciences, Lahore, Pakistan. It was developed in collaboration with Dr. Jennifer Cole at the University of Illinois Urbana-Champaign and funded by the South Asian Language Resource Center (SALRC) at the University of Chicago. The dictionary can also be found on the University of Chicago website (http://dsal.uchicago.edu/dictionaries/mewaram/). The home page of the Sindhi English Dictionary is
http://www.crulp.org/sed/
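The entry structure described above could be represented roughly as follows. This is only a sketch: the class name, field names and JSON serialization are assumptions made for illustration, not the dictionary's actual storage format, and the sample entry uses placeholder strings rather than real dictionary data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """One Sindhi headword with the fields listed in the description."""
    headword: str                      # Sindhi word in Sindhi-Arabic script
    part_of_speech: str
    variants: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    antonyms: List[str] = field(default_factory=list)
    inflections: List[str] = field(default_factory=list)
    grammatical_features: Dict[str, str] = field(default_factory=dict)
    english_senses: List[str] = field(default_factory=list)


def to_json(entry):
    """Serialize an entry; ensure_ascii=False keeps the script readable."""
    return json.dumps(asdict(entry), ensure_ascii=False)


def from_json(text):
    """Rebuild an entry from the JSON produced by to_json."""
    return DictionaryEntry(**json.loads(text))


# Placeholder values only, not taken from the dictionary.
sample = DictionaryEntry(
    headword="HEADWORD",
    part_of_speech="noun",
    grammatical_features={"gender": "PLACEHOLDER"},
    english_senses=["placeholder sense"],
)
```

Keeping every list-valued field optional with an empty default allows sparse entries, such as a headword with no recorded antonyms, to be stored without special cases.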

 
 


 
  Nafees Nastaleeq  
 

Nafees Nasta’leeq allows Urdu computing on Microsoft Windows 2000, NT and XP, Unix and Linux platforms. The font enables desktop and internet publishing, and electronic communication in Urdu, using existing software that supports the OTF specifications (without any plug-in), e.g. MS Word, MS Excel, MS Outlook (email), MS PowerPoint, Internet Explorer, Netscape Navigator and Mozilla. It works on all platforms supporting the OTF specifications. The font was developed according to calligraphic rules, following the style of Syed Nafees Al-Hussaini (Nafees Raqam), one of the finest calligraphers of Pakistan. Guidance and the calligraphy of the basic glyphs were provided by Syed Jameel-ur-Rehman, a pupil of Syed Nafees Shah and Hafiz Syed Anees-ul-Hassan. Nafees Nasta’leeq OTF contains approximately 1,000 glyphs, including about 26 ligatures. This work was funded by the Small Grants Program of IDRC, APDIP UNDP and APNIC.
http://www.apdip.net/projects/ictrnd/2002/nafees/

 
     

 
 

webmaster@crulp.org