2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 10,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 4 sold-out performances for that many people to see it.



3-step process to uninstall Cygwin from your Windows machine

Removing Cygwin from your Windows machine can be quite a task at times. Here is the 3-step process I followed to uninstall Cygwin.

At the command prompt, type the following commands in order –

1) C:\>takeown /r /d y /f cygwin

Read this command carefully: it is takeown, not takedown.

2) icacls cygwin /t /grant Everyone:F

3) rmdir /s /q cygwin
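
The three commands above can also be collected in a small script. What follows is a minimal Python sketch, not part of the original steps: the function names and the dry-run default are assumptions made for illustration. It expects to be run from the directory that contains the cygwin folder (C:\ in the steps above), and by default it only prints the commands.

```python
import subprocess

def cygwin_removal_commands(folder="cygwin"):
    """Return the three commands from the steps above, in order."""
    return [
        # 1) take ownership recursively (/r), answering yes (/d y) when prompted
        ["takeown", "/r", "/d", "y", "/f", folder],
        # 2) grant Everyone full control (F) on every file in the tree (/t)
        ["icacls", folder, "/t", "/grant", "Everyone:F"],
        # 3) delete the whole directory tree (/s) without asking (/q)
        ["rmdir", "/s", "/q", folder],
    ]

def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        line = " ".join(cmd)
        print(line)
        if not dry_run:
            # rmdir is a cmd.exe built-in, so the commands go through the shell
            subprocess.run(line, shell=True, check=True)

# Dry run: prints the three commands and changes nothing
run(cygwin_removal_commands())
```

Passing dry_run=False would actually execute the commands, so that is best tried only once the printed output looks right.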

Installing Apache Spark on Windows – Step by step approach

Apache Spark is a general-purpose, large-scale cluster computing engine which claims to be faster than Hadoop MapReduce. More theory on Spark can be found on the internet.

Here I will focus only on the installation steps for Apache Spark on Windows.

You need JDK 1.6+ to proceed with the steps below (also available as a PDF).

Step 1: Download & Untar SPARK

Download version 1.0.2 of Spark from the official website.

Untar the downloaded file to any location (say C:\spark-1.0.2)

Step 2: Download SBT msi (needed for Windows)

Download the sbt MSI installer & run it.






You may need to restart the machine so that the command line can recognize the sbt command.

Step 3: Package Spark using SBT

C:\spark-1.0.2>sbt assembly

Note: This step takes an enormous amount of time. Please be patient.


Step 4: Download SCALA

Spark 1.0.2 needs Scala 2.10. This is extremely important to note. You can read the README.md file in the Spark folder to find the correct Scala version for your Spark release.

Download and unzip Scala to any location (say C:\scala-2.10.1)

Set the SCALA_HOME environment variable & add the bin directory of Scala to the PATH variable

Verify the Scala version (and thus the download), for example with scala -version


Step 5: Start the spark shell



Sample program in SPARK

  • Create a data set of the integers 1 to 10000

              scala> val data = 1 to 10000

  • Use the Spark Context to create an RDD [Resilient Distributed Dataset] from that data

              scala> val distData = sc.parallelize(data)

  • Apply a filter to that data

             scala> distData.filter(_ < 10).collect()
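
To see what the filter step returns without starting a Spark shell, the same three steps can be imitated with plain Python collections. This is only an illustration of the logic: it does not use Spark, nothing is distributed, and the variable names simply mirror the Scala ones.

```python
# 1) Create a data set of the integers 1..10000 (Scala's "1 to 10000" is inclusive)
data = range(1, 10001)

# 2) Stand-in for sc.parallelize(data): a plain list, not an RDD
dist_data = list(data)

# 3) Keep only the elements below 10, like distData.filter(_ < 10).collect()
result = [x for x in dist_data if x < 10]

print(result)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In the real shell the filter runs on the RDD, and collect() brings the matching elements back as an array.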





Budget 2014 Top 50 words spoken

budget 2014 word cloud


Recently India’s finance budget 2014 was presented by the Finance Minister. I took the transcript of the speech from the budget website of the Indian government & plotted a bubble chart of the top 50 words spoken (of course minus stop words like I, is, was, them, etc).

Some observations –

1) Government was the most frequently spoken word. It was spoken 71 times (71x)

2) Tax (70x) & Taxes (18x) [Not surprising]

3) Development – 53x

Infrastructure – 33x

Growth – 31x

Investment – 28x

Banks – 24x

Economy – 23x

Coal – 21x

Agriculture – 20x

Manufacture – 18x
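
The counting behind a chart like this can be sketched in a few lines of Python. The post does not say which tool was used, so everything here is an assumption for illustration: the top_words helper, the tiny stop-word set, and the sample sentence (which is made up and not taken from the speech).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer
STOP_WORDS = {"i", "is", "was", "them", "the", "a", "of", "and", "to", "in", "on"}

def top_words(text, n=50, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Lower-case the text and pull out runs of letters as words
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Made-up sample text, not the actual budget speech
sample = "The government will cut tax. The government is focused on growth and tax reform."
print(top_words(sample, n=2))
```

Note that this simple version treats tax and taxes as different words, which matches the way they are listed separately above.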


Steps to Install Gradle


Gradle is a build automation tool that uses a Groovy-based DSL (rather than the traditional XML-based one). It uses a Directed Acyclic Graph (DAG) to determine the order in which tasks are run.
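
To make the DAG idea concrete, here is a minimal Python sketch of ordering tasks by their dependencies (a topological sort using Kahn's algorithm). It is not Gradle's actual scheduler; the task_order function and the task names are hypothetical and only illustrate the principle.

```python
from collections import deque

def task_order(depends_on):
    """Return tasks in an order where every task comes after its dependencies."""
    # Count unmet dependencies per task and record which tasks wait on each one
    pending = {task: len(deps) for task, deps in depends_on.items()}
    dependents = {task: [] for task in depends_on}
    for task, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(task)

    # Start with the tasks that depend on nothing
    ready = deque(t for t, n in pending.items() if n == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in dependents[task]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    # Anything left over is stuck in a cycle, so the graph was not a DAG
    if len(order) != len(depends_on):
        raise ValueError("cycle detected: the task graph is not a DAG")
    return order

# Hypothetical tasks; each one maps to the tasks it depends on
tasks = {
    "compile": [],
    "test": ["compile"],
    "jar": ["compile"],
    "build": ["test", "jar"],
}
print(task_order(tasks))
```

Because the graph has no cycles, such an order always exists; a cycle (task A needs B and B needs A) makes the function raise an error instead.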

For a detailed understanding, please read the wiki entry.

Steps to Install

1) Java 1.5 or above should be installed. Please verify this using the java -version command. If Java is installed, you should see the version; otherwise a message saying that java is not a recognized command will appear.

2) Download the latest binaries from the gradle website.



3) Unzip the gradle-<version>-bin.zip contents to your favorite folder – say – /User/gradle

4) For Mac, modify the PATH variable in the .profile file to include gradle.

export PATH=/User/gradle/bin:$PATH

(NOTE: If .profile does not exist, please create one)

For Windows, modify the PATH variable in the Environment Variables to include the gradle bin folder.

5) On the Terminal (or command prompt), type gradle -version

If everything went well, the gradle version should appear.


MongoDB Installation Steps (MongoDB version: 2.4.6)

Recently I installed MongoDB on my Mac running OS X Lion 10.7.5. Here are the steps I followed:

1) Download the latest production release of MongoDB from its site 



2) Unpack the downloaded file to your favorite place – say – /Users/mongo

3) From the Terminal go to /Users/mongo/bin

4) Create the following directory structure


5) Start the mongo server –

./mongod --dbpath data/db

Everything should start properly and the screen should show the server waiting for connections from a mongo shell.
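
mongod will not start unless the directory passed to --dbpath already exists. A minimal Python sketch of creating it is shown below; ensure_db_path is a hypothetical helper, and the demo runs in a throwaway temporary folder rather than in /Users/mongo/bin.

```python
import os
import tempfile

def ensure_db_path(base_dir):
    """Create base_dir/data/db if it is missing and return the full path."""
    db_path = os.path.join(base_dir, "data", "db")
    os.makedirs(db_path, exist_ok=True)  # no error if it already exists
    return db_path

# Demo in a temporary folder; in the steps above base_dir would be /Users/mongo/bin
base = tempfile.mkdtemp()
print(ensure_db_path(base))
```

Calling the helper a second time is harmless, since an existing directory is left as it is.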

Now start another terminal & enter the following command