LDA-Based Topic Modelling in Javascript: An Update

I’ve just pushed a Javascript version of LDA to my GitHub account. It’s based on my earlier work, which no longer functions. For testing, I use a subset of the SMS Spam Corpus available here (and thus take no responsibility for the inappropriateness of the text within :) ). Each topic is represented as a word cloud; the larger a word, the more weight it has in the topic. The source sentences are displayed again with a bar which shows the percentage distribution of topics for that sentence. Hovering over each area of the bar shows you the words in that topic. You can of course replace the text with any other, change the number of topics using the slider, and press the ‘Analyse’ button to see it work. ...
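The per-sentence bar can be sketched roughly as follows. This is a minimal Python illustration, not the Javascript code from the repository: it assumes the sampler has already assigned one topic index to every word of a sentence, and the function name `topic_percentages` is made up for this example.

```python
from collections import Counter


def topic_percentages(assignments, num_topics):
    """Return the percentage of a sentence's words assigned to each topic.

    `assignments` holds one topic index per word (assumed to come from an
    already-trained LDA sampler); `num_topics` is the slider value.
    """
    total = len(assignments)
    if total == 0:
        # An empty sentence has no topic mass to distribute.
        return [0.0] * num_topics
    counts = Counter(assignments)
    return [100.0 * counts.get(topic, 0) / total for topic in range(num_topics)]


# Four words: two in topic 0, one each in topics 1 and 2.
print(topic_percentages([0, 0, 1, 2], 3))
```

Each value in the returned list would correspond to the width of one coloured area in the bar.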

April 19, 2014

Do something this summer!

The semester is ending and I am getting lots of emails from my students on how to get the most out of the summer break. So here’s a little list, in no particular order (which I might keep expanding later on), outlining some of the things that might make your summer productive.

1. Make a study group: Find some fellow thetas, pick up a tough, interesting book (Knuth anyone?), distribute chapters/topics and teach each other. With the right people, it can not only lead to much geekish fun, but will also help you in the coming semesters (and for the rest of your life).
2. Enroll in an online course: Interested in finding out Archaeology’s Dirty Little Secrets? Want to learn about developing innovative ideas for startups? Register yourself in one (or more) of the free online courses offered at websites such as Udacity and Coursera. Most of the content is awesome and you can most definitely find a course or two no matter what your interests are.
3. Make a game: Yes, a proper computer game. Like those arcade things that you (used to?) play in childhood. Pac-Man, Qix, Ludo, Chess… You can find lots of tutorials and courses online (see point 2). Challenge yourself and make a game for your cell phone. You know enough programming to do that. All you need is a platform and a good idea!
4. Finish a reading list: Find a book list (or two) and read all those books! The library is your friend (and so are Galaxy, Variety and Readings bookstores).
5. Write something: Start with a sentence. Do it multiple times. Make a paragraph. Do THAT multiple times. Make it a short story. Blow it up into a novel. People write novels in one month; you have two!
6. Do an internship: Call that uncle (your daddy’s friend) or the bhai jaan (your brother’s friend) who have their own shops/software houses/factories. Ask them for an internship. Work for a few weeks and see how the world REALLY functions. It will be one of the best lessons you’ll ever have. (Keep watching the university notice board for opportunities.)
7. Make some money online: If you are really good at something, there might be people out there who are willing to pay you to work for them online. Find your niche and earn some gadget money!
8. Catch up on programming: So you barely passed your programming course? Well, that is over now and it’s probably time to catch up and really learn something yourself by doing some small projects. It just might prepare you well enough for the next semester! Who knows!
9. Take a hike, literally: Do not (and I repeat, do NOT) miss out on the small excursion trips arranged by the university. Or if you don’t like their destination, arrange one yourself. Nothing freshens up the mind more than going to the mountains for a week or two and walking your worries off.
10. Learn an instrument: Anything you always wanted to learn! Oh, and Coursera has a guitar class going on these days.
11. Start a sport: You don’t have to be really good to play a sport. So what if you can’t hit a yorker or dribble a basketball without looking? Start doing it regularly and you’ll get good enough to really start enjoying it.
12. Volunteer: Find a good social cause. Volunteer for it. Or teach a working kid how to read/write. (Sadqa-e-jaaria)
13. Learn a new language: Python? Even a kid can learn it and it’s fun too! Or you can be non-geeky and find a natural language to learn (German? French?). Learn while you can and you won’t regret it later (like I do).
14. Make Art!: Even if you don’t know how to… Get inspired and start making your own stuff. Play around with Paint if you don’t want to get your hands dirty with acrylics and brushes. (All images in this post were created using MS Paint in 2-3 minutes.)

From personal experience, I can guarantee that most of the things in this list can be done in parallel and can do wonders for your social life :) ...

May 29, 2013

Where does the money go?

Last night, I took a look at the federal budget for 2012-2013. Apparently we will be spending about 25% of it on “Servicing of Domestic Debt”. Take a more detailed look here.

June 22, 2012

Twingual: A twitter client for bilingual tweeple

In my last post, I highlighted some problems that I face daily while using Twitter in Urdu as well as in English. A few days ago, I decided to experiment with the Twitter API and write my own client to fix some of these problems. You can see the result at www.twingual.com. It is a Javascript-only Twitter client which supports neat Nastaleeq Urdu fonts as well as transliteration. It’s a work in progress and does not implement all Twitter features. If you like it and want to see something you need every day implemented, feel free to send a tweet. ...

May 9, 2012

DependenSee: A Dependency Parse Visualisation/Visualization Tool

There aren’t many tools which allow you to visualise sentences parsed with dependency grammars. Here’s a small tool which generates a PNG of the dependency graph of a given sentence using the Stanford Parser.

How to run: The dependency graph shown in the image above for Einey’s quote can be generated by following these steps.

1. Click here to download <dependensee-3.7.0.jar>.
2. Download the latest version of the Stanford Parser. I am using version 3.7.0.
3. Place the jar file in the Stanford Parser folder.
4. On the command prompt, run the following (the ; classpath separator is for Windows; on Linux or macOS use : instead):

   java -cp dependensee-3.7.0.jar;stanford-parser.jar;stanford-parser-3.6.0-models.jar;slf4j-api.jar com.chaoticity.dependensee.Main "Example isn't another way to teach, it is the only way to teach." out.png

...

August 28, 2010

Making a copy of WEKA Instances

This ‘thing’ took about 30 minutes to figure out. According to the WEKA documentation, if you add a new Instance to an existing Instances object, String values are not transferred! If you are copying a dataset with a string attribute, you need to transfer the string manually. The code segment below copies the i-th instance from source to dest, where the first attribute (at index 0) is a string attribute. ...

April 12, 2010

Google and Urdu Stemming

Is Google (finally) stemming Urdu? The last time I checked, they were doing something like a transliteration-based search, but in the screenshot below you can see that searching for the phrase ان پڑھ چٹا shows some stemming is being used. Does anyone know anything? Oh, and while I’m on this topic, I would also like to know why it is called چٹا ان پڑھ ?

March 5, 2010

Visualizing Citation Networks

For techies: I’ve been working on citation networks lately. You can visualize such a network as a graph. In this graph, the nodes represent publications (papers, articles, etc.) and the edges represent citations between them. The graph above was produced using GraphViz. The data is from the ACL Anthology Network, which contains publications from the publicly available ACL Anthology. For non-techies: Oooooo! pretty picture!
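For the curious, here is a rough sketch of how such a graph can be handed to GraphViz: a small Python function that turns (citing, cited) pairs into DOT text. The function name and the paper identifiers are placeholders invented for this example, not real ACL Anthology Network data, and this is not necessarily how the picture above was made.

```python
def citations_to_dot(citations):
    """Build GraphViz DOT text for a directed citation graph.

    `citations` is an iterable of (citing_id, cited_id) pairs; every paper
    becomes a node and every citation becomes a directed edge.
    """
    pairs = list(citations)
    # Collect each paper once, whether it cites or is cited.
    nodes = sorted({paper for pair in pairs for paper in pair})
    lines = ["digraph citations {"]
    for paper in nodes:
        lines.append(f'    "{paper}";')
    for citing, cited in pairs:
        lines.append(f'    "{citing}" -> "{cited}";')
    lines.append("}")
    return "\n".join(lines)


# Placeholder identifiers, purely for illustration.
example = [("paper_b", "paper_a"), ("paper_c", "paper_a"), ("paper_c", "paper_b")]
print(citations_to_dot(example))
```

Saving the output to a file such as citations.dot and running `dot -Tpng citations.dot -o citations.png` should render the network as an image.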

February 4, 2010

Online English to Urdu Translator

While all the online English to Urdu translators that I have seen don’t really work that well (read: suck), if we make use of the overlapping vocabulary and grammar of Hindi and Urdu along with Google’s translation API, things come out pretty decent (as mentioned in my previous post). Here’s a small 15-minute first-cut script which just uses English to Hindi translation and then transliterates from Hindi to Urdu. Feel free to use the code and do ping me if you improve something. This works as a Hindi to Urdu transliterator as well. ...
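The two-stage idea can be sketched like this in Python. It is only an illustration under loud assumptions: the character table covers just the letters of one example word and ignores inherent vowels, the names `DEVANAGARI_TO_URDU`, `english_to_urdu` and `fake_translate` are invented here, and the translation step is a stub rather than a call to Google's API.

```python
# Toy Devanagari-to-Urdu table (an assumption for illustration only; a real
# transliterator needs far more rules than a character-by-character lookup).
DEVANAGARI_TO_URDU = {
    "\u0909": "\u0627",  # letter U   -> alef
    "\u0930": "\u0631",  # letter RA  -> reh
    "\u0926": "\u062f",  # letter DA  -> dal
    "\u0942": "\u0648",  # vowel sign UU -> waw
    "\u094d": "",        # virama: no separate Urdu letter
}


def transliterate_hindi_to_urdu(text):
    """Map each Devanagari character through the table; pass others through."""
    return "".join(DEVANAGARI_TO_URDU.get(ch, ch) for ch in text)


def english_to_urdu(sentence, translate_en_to_hi):
    """Stage 1: English -> Hindi (injected function). Stage 2: Hindi -> Urdu script."""
    hindi = translate_en_to_hi(sentence)
    return transliterate_hindi_to_urdu(hindi)


def fake_translate(sentence):
    """Hypothetical stand-in for the translation service, one word only."""
    return {"Urdu": "\u0909\u0930\u094d\u0926\u0942"}[sentence]


print(english_to_urdu("Urdu", fake_translate))
```

Passing the translator in as a function keeps the transliteration step usable on its own, which matches the note that the script also works as a plain Hindi to Urdu transliterator.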

January 23, 2010

How do you transliterate that?

I am thinking of using Google’s English to Hindi translation and hooking it to a Hindi to Urdu transliterator to get an approximate English to Urdu translation. The Hindi to English transliteration provided by Google has some errors which might not be there if we convert directly to Urdu. For example, on translating the sentence “It can be used in Urdu too”, we get the Hindi translation यह उर्दू में इस्तेमाल किया जा सकता है ...

January 21, 2010