“The web is sad”, said Webgrep

A paper on the “Positivity of the English Language” recently caught the attention of the mainstream media. It claimed that English words exhibit a “clear positive bias”. The authors (who happen to be mathematicians) based this conclusion on their analysis of tweets, books, news articles and song lyrics in English. The results showed that people used more positive words than negative ones. Frequency-based analysis is not exactly new in Computational Linguistics, and the spread of this article in the media might be a good example for understanding why some news items receive more attention than others. The experiment is interesting nevertheless, and finding out whether other languages behave the same way might be even more intriguing.

While search engines can be used to perform this task, words are not the only means of conveying emotion in the digital world. Since the invention of the smiley, emoticons have played a major role in attaching sentiment to online text. There have been studies on using emoticons to detect sentiment in different text genres, but most search engines do not support queries that return results for emoticons. The obvious solution is to keep your own snapshot of the web and run a text search over it, but this approach has always required a lot of storage and processing resources. Until now.

Blekko has introduced a WebGrep service which searches 4 billion pages for your text, using text patterns instead of words. I will not hesitate to use the words ‘pretty cool’ to describe it. As a simple test, I tried comparing a smiley face emoticon with a sad face emoticon to gauge the overall mood of the web. Sadly [sic], for 142 million sad faces (assuming one sad smiley per page), there are only 104 million happy ones. While these statistics can be improved by including variants of existing emoticons, they might be taken as an indication that we need to be happier.
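For the curious, here is a rough idea of what a pattern-based count like this could look like on a small scale. This is only a sketch in Python over a handful of made-up pages, not how WebGrep itself works. The patterns, the `count_pages` helper and the sample pages are all my own assumptions for illustration. It counts each page once per mood, in line with the "one smiley per page" assumption above, and it accepts a few common variants such as `:-)` and `=(`.

```python
import re

# Illustrative patterns (assumptions, not WebGrep's syntax): an emoticon must
# stand alone between whitespace so that ordinary punctuation is not counted.
HAPPY = re.compile(r"(?<!\S)[:=]-?[)D](?!\S)")   # :)  :-)  =)  :D
SAD = re.compile(r"(?<!\S)[:=]-?\((?!\S)")       # :(  :-(  =(


def count_pages(pages):
    """Return (happy, sad): the number of pages with at least one match each."""
    happy = sum(1 for text in pages if HAPPY.search(text))
    sad = sum(1 for text in pages if SAD.search(text))
    return happy, sad


# Made-up sample pages standing in for a web snapshot.
pages = [
    "Monday again :(",
    "Shipped the release :) :)",
    "Power cut, again :-( but tea helps :)",
    "No emoticons here (see the note: it is plain text).",
]

happy, sad = count_pages(pages)
print(f"happy pages: {happy}, sad pages: {sad}")
```

A page with both moods is counted in both totals, and repeated smileys on one page count only once, so the numbers are page counts rather than emoticon counts.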


One way to make that happen is to consciously avoid using sad emoticons when chatting and blogging (as in this post). We can then claim that we are making the web a better place, one smile at a time :)

One thought on ““The web is sad”, said Webgrep”

  1. Maybe because ppl use sad faces with statuses like these:

    not in the mood to go to the office today 🙁
    tomorrow is Monday 🙁
    my stomach is upset 🙁
    what a letdown that was :S
    the girl ran away 🙁 “this might be sad actually :P”
    the power went out again 🙁

    Thanks to social networking and the unbounded lust for updating statuses, tweets and “what ya doing right now”, ppl tend to share every single unrelated and unimportant line of their lives on the Internet. And since they are using social networking tools like Facebook, Twitter and Google+, they are already victims of low self-esteem, hence the pointless and endless sad emoticons.
