With the increasing number of “opinion-dispensing apps” out there on the web that let Urdu users write in Unicode, there is (or will soon be) a need for getting some meaningful statistics out of the ever-present sentiment of the masses (or at least the web-savvy subset). This calls for resources which enable automatic processing of sentiment, one of which is a sentiment lexicon for Urdu. (For people uninitiated in computational linguistics, a lexicon is just a list of words.) Since I couldn’t find any sentiment lexicon available for Urdu on the tubes, I decided to put in some effort and create a new one.
The Urdu Sentiment Lexicon is a list of 2,607 positive and 4,728 negative sentiment/opinion words for Urdu. It is based on a similar list for English available here. The English words have been translated to Urdu automatically using a dictionary lookup, and all Urdu synonyms returned by the lookup have been included as well. The lexicon has also been manually inspected (but very quickly), and any irrelevant words have been deleted.
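For the curious, the dictionary-lookup step can be sketched roughly as below. This is only an illustration of the idea, not the actual script used to build the lexicon: the function name `translate_lexicon`, the tiny `en_ur` dictionary, and the sample words are all made up for the example, and a real English-to-Urdu dictionary would be far larger.

```python
def translate_lexicon(english_words, en_ur_dictionary):
    """Collect every Urdu entry the dictionary lists for each English word."""
    urdu_words = set()
    for word in english_words:
        # English words with no dictionary entry are simply skipped.
        urdu_words.update(en_ur_dictionary.get(word.lower(), []))
    return urdu_words


# Hypothetical toy dictionary: each English word maps to all its Urdu synonyms.
en_ur = {
    "good": ["اچھا", "عمدہ"],
    "happy": ["خوش"],
    "bad": ["برا", "خراب"],
}

positive = translate_lexicon(["good", "happy", "wonderful"], en_ur)
negative = translate_lexicon(["bad"], en_ur)
```

Using a set means an Urdu word reached from several English words is stored only once. The manual inspection pass would then happen on the resulting lists.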