Configuring Hadoop on Ubuntu in pseudo-distributed mode

Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. It can be run in three different modes: (1) Standalone, (2) Pseudo-Distributed, and (3) Fully-Distributed. This post covers setting up Hadoop 2.5.1 in pseudo-distributed mode on an Ubuntu machine. For setting up …

Git Basics – A cheat sheet for your daily git needs.

This post is a reference for everyday git needs. It does not cover any advanced git concepts. Git is a distributed version control system. Some basic terminology:

Directory: A folder that contains multiple files.
Repository: A directory where Git has been initialized to start version-controlling your files.

I have created an …

Configuring Hadoop on Mac OS X in pseudo-distributed cluster mode

Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. It can be run in three different modes: (1) Standalone, (2) Pseudo-Distributed, and (3) Fully-Distributed. This post covers setting up Hadoop 2.5.1 in pseudo-distributed mode. A pseudo-distributed mode is one where each …

Multiple Java Installations on OSX and switching between them

Sometimes it is necessary to have two versions of Java installed on OS X and to switch between them quickly, reliably, and conveniently. This post shows how this can be achieved. I am currently using OS X 10.9.4, and I was required to have both Java 6 and Java 7 …